CHOICE-MEMORY TRADEOFF IN ALLOCATIONS 



NOGA ALON, ORI GUREL-GUREVICH, AND EYAL LUBETZKY 

Abstract. In the classical balls-and-bins paradigm, where $n$ balls are
placed independently and uniformly in $n$ bins, typically the number of
bins with at least two balls in them is $\Theta(n)$ and the maximum number
of balls in a bin is $\Theta(\frac{\log n}{\log\log n})$. It is well known that
when each round offers $k$ independent uniform options for bins, it is
possible to typically achieve a constant maximal load if and only if
$k = \Omega(\log n)$. Moreover, it is possible whp to avoid any collisions
between $n/2$ balls if $k > \log_2 n$.

In this work, we extend this into the setting where only $m$ bits of
memory are available. We establish a tradeoff between the number of
choices $k$ and the memory $m$, dictated by the quantity $km/n$. Roughly
put, we show that for $km \gg n$ one can achieve a constant maximal load,
while for $km \ll n$ no substantial improvement can be gained over the
case $k = 1$ (i.e., a random allocation).

For any $k = \Omega(\log n)$ and $m = \Omega(\log^2 n)$, one can achieve a
constant load whp if $km = \Omega(n)$, yet the load is unbounded if
$km = o(n)$.

Similarly, if $km > Cn$ then $n/2$ balls can be allocated without any
collisions whp, whereas for $km < \varepsilon n$ there are typically $\Omega(n)$
collisions. Furthermore, we show that the load is whp at least
$\frac{\log(n/m)}{\log k + \log\log(n/m)}$. In particular, for
$k \le \mathrm{polylog}(n)$, if $m = n^{1-\delta}$ the optimal maximal load
is $\Theta(\frac{\log n}{\log\log n})$ (the same as in the case $k = 1$),
while $m = 2n$ suffices to ensure a constant load. Finally, we analyze
non-adaptive allocation algorithms and give tight upper and lower bounds
for their performance.



1. Introduction 

The balls-and-bins paradigm (see, e.g., [HI IT]) describes the process 
where b balls are placed independently and uniformly at random in n bins. 
Many variants of this classical occupancy problem were intensively studied, 
having a wide range of applications in Computer Science. 

It is well-known that when $b = \lambda n$ for $\lambda$ fixed and $n \to \infty$, the load
of each bin tends to Poisson with mean $\lambda$ and the bins are asymptotically
independent. In particular, for $b = n$, the typical number of empty bins at
the end of the process is $(1/e + o(1))n$. The typical maximal load in that
case is $(1 + o(1)) \frac{\log n}{\log\log n}$. In what follows, we say that an
event holds with high probability (whp) if its probability tends to 1 as
$n \to \infty$.

The extensive study of this model in the context of load balancing was
pioneered by the celebrated paper of Azar et al. [3] (see the survey [19])
that analyzed the effect of a choice between $k$ independent uniform bins on
the maximal load, in an online allocation of $n$ balls to $n$ bins. It was shown
in [3] that the Greedy algorithm (choose the least loaded bin of the $k$)
is optimal and achieves a maximal load of $\log_k \log n$ whp, compared to a
load of $\frac{\log n}{\log\log n}$ for the original case $k = 1$. Thus, $k = 2$
random choices already significantly reduce the maximal load, and as $k$
further increases, the maximal load drops until it becomes constant at
$k = \Omega(\log n)$.
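
The Greedy rule described above can be sketched in a few lines. This is only an illustrative simulation, not code from [3]; the function names `greedy_allocate` and `random_choices` are hypothetical, and ties are broken in favour of the first offered bin, which is an assumption of this sketch.

```python
import random


def greedy_allocate(n, choices_per_ball):
    """Place each ball in the least loaded of its offered bins.

    `choices_per_ball` is a list of lists of bin indices; ties are broken
    in favour of the first offered bin (an arbitrary choice for this sketch).
    """
    loads = [0] * n
    for options in choices_per_ball:
        target = min(options, key=lambda b: loads[b])
        loads[target] += 1
    return loads


def random_choices(n, k, rng):
    """k independent uniform bin choices (with repetition) for each of n balls."""
    return [[rng.randrange(n) for _ in range(k)] for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    n = 20_000
    for k in (1, 2, 4):
        loads = greedy_allocate(n, random_choices(n, k, rng))
        print("k =", k, "maximal load =", max(loads))
```

Running the script prints the observed maximal load for a few values of k, which can be compared informally with the orders of magnitude quoted above.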

In the context of online bipartite matchings, the process of dynamically 
matching each client in a group A of size n/2 with one of k independent 
uniform resources in a group B of size n precisely corresponds to the above 
generalization of the balls-and-bins paradigm: Each ball has k options for 
a bin, and is assigned to one of them by an online algorithm that should 
avoid collisions (no two balls can share a bin). It is well known that the 
threshold for achieving a perfect matching in this case is $k = \log_2 n$: For
$k \ge (1 + \varepsilon) \log_2 n$, whp every client can be exclusively matched to a
target resource, and if $k \le (1 - \varepsilon) \log_2 n$ then $\Omega(n)$ requests
cannot be satisfied.

In this work, we study the above models in the presence of a constraint 
on the memory that the online algorithm has at its disposal. We find that a 
tradeoff between the choice and the memory governs the ability to achieve 
a perfect allocation as well as a constant maximal load. Surprisingly, the 
threshold separating the subcritical regime from the supercritical regime 
takes a simple form, in terms of the product of the number of choices k, and 
the size of the memory in bits m: 

• If $km \gg n$ then one can allocate $(1 - \delta)n$ balls in $n$ bins without any
collisions whp, and consequently achieve a load of 2 for $n$ balls.

• If $km \ll n$ then any algorithm for allocating $\varepsilon n$ balls whp creates
$\Omega(n)$ collisions and an unbounded maximal load.

Roughly put, when $km \gg n$ the amount of choice and memory at hand
suffices to guarantee an essentially best-possible performance. On the other
hand, when $km \ll n$, the memory is too limited to enable the algorithm to
make use of the extra choice it has, and no substantial improvement can be
gained over the case $k = 1$, where no choice is offered whatsoever.

Note that rigorous lower bounds for space, and in particular tradeoffs be- 
tween space and performance (time, communication, etc.), have been studied 
intensively in the literature of Algorithm Analysis, and are usually highly 
nontrivial. See, e.g., [T|Hti6l[8|[9t[T2|[T3] for some notable examples. 

Our first main result establishes the exact threshold of the choice-memory
tradeoff for achieving a constant maximal-load. As mentioned above, one
can verify that when there is unlimited memory, the maximal load is whp
uniformly bounded iff $k = \Omega(\log n)$. Thus, assuming that
$k = \Omega(\log n)$ is a prerequisite for discussing the effect of limited
memory on this threshold.

Theorem 1. Consider $n$ balls and $n$ bins, where each ball has $k = \Omega(\log n)$
uniform choices for bins, and $m = \Omega(\log^2 n)$ bits of memory are available.
If $km = \Omega(n)$, one can achieve a maximal-load of $O(1)$ whp. Conversely,
if $km = o(n)$, any algorithm whp creates a load that exceeds any constant.

Consider the case $k = \Theta(\log n)$. The naive algorithm for achieving a
constant maximal-load in this setting requires roughly $n$ bits of memory ($2n$
bits of memory always suffice; see Subsection 1.3). Surprisingly, the above
theorem implies that $O(n/\log n)$ bits of memory already suffice, and this is
tight.

As we later show, one can extend the upper bound on the load, given
in Theorem 1, to $O(\frac{n}{km})$ (useful when
$\frac{n}{km} \le \frac{\log n}{\log\log n}$), whereas the lower
bound tends to $\infty$ with $\frac{n}{km}$. This further demonstrates how the
quantity $km/n$ governs the value of the optimal maximal load. Indeed,
Theorem 1 will follow from Theorems 3 and 4 below, which determine that the
threshold for a perfect matching is $km = \Theta(n)$.

Again consider the case of $k = \Theta(\log n)$, where an online algorithm with
unlimited memory can achieve an $O(1)$ load whp. While the above theorem
settles the memory threshold for achieving a constant load in this case,
one can ask what the optimal maximal load would be below the threshold.
This is answered by the next theorem, which shows that in this case, e.g.,
$m = n^{1-\delta}$ bits of memory yield no significant improvement over an
algorithm which makes random allocations.

Theorem 2. Consider $n/k$ balls and $n$ bins, where each ball has $k$ uniform
choices for bins, and $m \ge \log n$ bits of memory are available. Then for any
algorithm, the maximal load is at least
$(1 + o(1)) \frac{\log(n/m)}{\log\log(n/m) + \log k}$ whp.

In particular, if $m = n^{1-\delta}$ for some $\delta > 0$ fixed and
$2 \le k \le \mathrm{polylog}(n)$, then the maximal load is
$\Theta(\frac{\log n}{\log\log n})$ whp.

Recall that a load of order $\frac{\log n}{\log\log n}$ is what one would obtain
using a random allocation of $n$ balls in $n$ bins. The above theorem states
that, when $m = n^{1-\delta}$ and $k \le \mathrm{polylog}(n)$, any algorithm would
create such a load already after $n/k$ rounds.
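
To get a feel for the bound in Theorem 2, the following sketch evaluates its leading term, dropping the $(1 + o(1))$ factor, next to the scale $\log n / \log\log n$ of a random allocation. The function names are hypothetical, and natural logarithms are an assumption of this sketch (the base only affects constants).

```python
import math


def theorem2_lower_bound(n, m, k):
    """Leading term log(n/m) / (log log(n/m) + log k), without the 1+o(1) factor."""
    ratio = n / m
    if ratio <= math.e:
        raise ValueError("need n/m > e so that log log(n/m) is positive")
    return math.log(ratio) / (math.log(math.log(ratio)) + math.log(k))


def random_allocation_scale(n):
    """Order of the maximal load for k = 1: log n / log log n."""
    return math.log(n) / math.log(math.log(n))


if __name__ == "__main__":
    n = 2 ** 40
    for m in (2 ** 10, 2 ** 20, 2 ** 30):
        print(m, theorem2_lower_bound(n, m, k=2), random_allocation_scale(n))
```

As the memory m grows towards n, the evaluated bound shrinks, in line with the tradeoff described in the text.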

Before describing our other results, we note that the lower bounds in our 
theorems in fact apply to a more general setting. In the original model, 
in each round the online algorithm chooses one of k uniformly chosen bins, 
thus inducing a distribution on the location of the next ball. Clearly, this 
distribution has the property that no bin has a probability larger than k/n. 
Our theorems apply to a relaxation of the model, where the algorithm
is allowed to dynamically choose a distribution $Q_t$ for each round $t$, which
is required to satisfy the above property (i.e., $\|Q_t\|_\infty \le k/n$). We
refer to these distributions as strategies.

Observe that indeed this model gives more power to the online algorithm:
For instance, if $k = 2$ (and the memory is unlimited), an algorithm in the
relaxed model can allocate $n/2$ balls perfectly (by assigning probability 0
to the occupied bins), whereas in the original model collisions occur already
with $n^{2/3} w(n)$ balls whp, for any $w(n)$ tending to $\infty$ with $n$.
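
The perfect allocation in the relaxed model can be illustrated as follows. This is a sketch under the assumption that the strategy at each round is the uniform distribution over the currently empty bins; the function name is hypothetical. While fewer than $n/2$ balls have been placed, more than $n/2$ bins are empty, so no bin receives probability above $2/n$.

```python
import random


def relaxed_perfect_allocation(n, rng):
    """Place n // 2 balls, each uniform over the currently empty bins.

    Returns (loads, max_prob), where max_prob is the largest probability
    that any strategy used assigned to a single bin.
    """
    empty = list(range(n))
    loads = [0] * n
    max_prob = 0.0
    for _ in range(n // 2):
        # Uniform over the empty bins: occupied bins get probability 0.
        max_prob = max(max_prob, 1.0 / len(empty))
        idx = rng.randrange(len(empty))
        empty[idx], empty[-1] = empty[-1], empty[idx]  # swap-remove
        loads[empty.pop()] += 1
    return loads, max_prob


if __name__ == "__main__":
    loads, max_prob = relaxed_perfect_allocation(1000, random.Random(0))
    print(max(loads), max_prob <= 2 / 1000)
```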

Furthermore, we also relax the memory constraint on the model. Instead
of treating the algorithm as an automaton with $2^m$ states, we only impose
the restriction that there are at most $2^m$ different strategies to choose from.
In other words, at time $t$, the algorithm knows the entire history (the exact
location of each ball so far), and needs to choose one of its $2^m$ strategies for
the next round. In this sense, our lower bounds are for the case of limited
communication complexity rather than limited space complexity.

We note that all our bounds remain valid when each round offers k choices 
with repetitions. 

1.1. Tradeoff for perfect matching. The next two theorems address the
threshold for achieving a perfect matching when allocating $(1 - \delta)n$ balls in
$n$ bins for some fixed $0 < \delta < 1$ (note that for $\delta = 0$, even with
unlimited memory, one needs $k = \Omega(n)$ choices to avoid collisions whp). The
upper and lower bounds obtained for this threshold are tight up to a
multiplicative constant, and again pinpoint its location at $km = \Theta(n)$. The
constants below were chosen to simplify the proofs and could be optimized.

Theorem 3. For $\delta > 0$ fixed, consider $(1 - \delta)n$ balls and $n$ bins: Each
ball has $k$ uniform choices for bins, and there are $m \ge \log n$ bits of memory.
If

\[ km \le \varepsilon n \quad \text{for some small constant } \varepsilon > 0 , \]

then any algorithm has $\Omega(n)$ collisions whp.

Furthermore, the maximal load is whp $\Omega(\log\log(\frac{n}{km}))$.

Theorem 4. For $\delta > 0$ fixed, consider $(1 - \delta)n$ balls and $n$ bins, where
each ball has $k$ uniform choices for bins, and $m$ bits of memory are available.
The following holds for any $k \ge (3/\delta) \log n$ and
$m \ge \log n \cdot \log_2 \log n$. If

\[ km \ge Cn \quad \text{for some } C = C(\delta) > 0 , \]

then a perfect allocation (no collisions) can be achieved whp.

In light of the above, for any value of $k$, the online allocation algorithm
given by Theorem 4 is optimal with respect to its memory requirements.

1.2. Non-adaptive algorithms. In the non-adaptive case the algorithm is
again allowed to choose a fixed (possibly randomized) strategy for selecting
the placement of ball number $t$ in one of the $k$ possible randomly chosen
bins given in step $t$. Therefore, each such algorithm consists of a sequence
$Q_1, Q_2, \ldots, Q_n$ of $n$ pre-determined strategies, where $Q_t$ is the strategy
for selecting the bin in step number $t$.

Here we show that even if k = n ^°^'°.^" , the maximum load is whp at 

least $(1 - o(1)) \frac{\log n}{\log\log n}$; that is, it is essentially as large as in the case $k = 1$.
It is also possible to obtain tight bounds for larger values of $k$. We illustrate
this by considering the case $k = \Theta(n)$.

Theorem 5. Consider the problem of allocating $n$ balls into $n$ bins, where
each ball has $k$ uniform choices for bins, using a non-adaptive algorithm.

(i) The maximum load in any non-adaptive algorithm with k < n ^°^^°^" 

is whp at least $(1 - o(1)) \frac{\log n}{\log\log n}$.
(ii) Fix $0 < \alpha < 1$. The maximum load in any non-adaptive algorithm
with $k = \alpha n$ is whp $\Omega(\sqrt{\log n})$. This is tight, that is, there exists a
non-adaptive algorithm with $k = \alpha n$ so that the maximum load in it is
$O(\sqrt{\log n})$ whp.

1.3. Range of parameters. In the above theorems and throughout the 
paper, the parameter k may assume values up to n. As for the memory, one 
may naively use $n \log_2 L$ bits to store the status of $n$ bins, each containing at
most $L$ balls. The next observation shows that the $\log_2 L$ factor is redundant:

Observation. At most $n + b - 1$ bits of memory suffice to keep track of the
number of balls in each bin when allocating $b$ balls in $n$ bins.

Indeed, one can maintain the number of balls in each bin using a vector in
$\{0, 1\}^{n+b-1}$, where 1-bits stand for separators between the bins. In light of
this, the original case of unlimited memory corresponds to the case $m = 2n$.
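
The separator encoding in the Observation can be sketched as follows. This is one possible concrete layout, chosen for illustration: each ball is a 0-bit, and a 1-bit is written before every bin except the first. The function names `encode`, `decode` and `add_ball` are hypothetical.

```python
def encode(loads):
    """Encode bin loads as bits: 0 = ball, 1 = separator between bins."""
    bits = []
    for i, load in enumerate(loads):
        if i > 0:
            bits.append(1)       # separator before every bin except the first
        bits.extend([0] * load)  # one 0-bit per ball in this bin
    return bits


def decode(bits):
    """Recover the list of bin loads from the bit vector."""
    loads = [0]
    for bit in bits:
        if bit:
            loads.append(0)      # a separator starts the next bin
        else:
            loads[-1] += 1
    return loads


def add_ball(bits, bin_index):
    """Return the encoding after adding one ball to bin `bin_index`."""
    pos, separators_seen = 0, 0
    while separators_seen < bin_index:
        separators_seen += bits[pos]
        pos += 1
    return bits[:pos] + [0] + bits[pos:]


if __name__ == "__main__":
    bits = encode([2, 0, 1])
    print(bits, len(bits))           # n + b - 1 = 3 + 3 - 1 bits
    print(decode(add_ball(bits, 1)))
```

With n bins and b balls the vector has exactly b zeros and n - 1 ones, which matches the count in the Observation.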

1.4. Main techniques. The key argument in the lower bound on the
performance of the algorithm with limited memory is analyzing the expected
number of new collisions that a given step introduces. We wish to estimate
this value with an error probability smaller than $2^{-m}$, so it would hold whp
for all of the $2^m$ possible strategies for this step.

To this end, we apply a large deviation inequality, which relates the sum
of a sequence of dependent random variables $(X_i)$ with the sum of their
"predictions" $(Y_i)$, where $Y_i$ is the expectation of $X_i$ given the history up to
time $i$. Proposition 2.1 essentially shows that if the sum of the predictions $Y_i$
is large (exceeds some $\ell$), then so is the sum of the actual random variables
$X_i$, except with probability $\exp(-c\ell)$. In the application, the variable $X_i$
measures the number of new collisions introduced by the $i$-th ball, and $Y_i$ is
determined by the strategy $Q_i$ and the history so far.

The key ingredient in proving this proposition is a Bernstein-Kolmogorov 
type inequality for martingales, which appears in a paper of Freedman [11] 
from 1975, and bounds the probability of deviation of a martingale in terms 
of its cumulative variance. We reproduce its elegant proof for completeness. 
Crucially, that theorem does not require a uniform bound on individual 
variances (such as the one that appears in standard versions of Azuma- 
Hoeffding), and rather treats them as random variables. Consequently, the 
quality of our estimate in Proposition 2.1 is unaffected by the number of
random variables involved. 

For the upper bounds, the algorithm essentially partitions the bins into 
blocks, where for different blocks it maintains an accounting of the occupied 
bins with varying resolution. Once a block exceeds a certain threshold of 
occupied bins, it is discarded and a new block takes its place. 

1.5. Organization. This paper is organized as follows. In Section 2 we
prove the large deviation inequality (Proposition 2.1). Section 3 contains the
lower bounds on the collisions and load, thus proving Theorem 3. Section 4
provides algorithms for achieving a perfect-matching and for achieving a
constant load, respectively proving Theorem 4 and completing the proof of
Theorem 1. In Section 5 we extend the analysis of the lower bound to prove
Theorem 2. Section 6 discusses non-adaptive allocations, and contains the
proof of Theorem 5. Finally, Section 7 is devoted to concluding remarks.

Remark. The problem of balanced allocations with limited memory was 
proposed to us by Itai Benjamini. In a recent independent work, Benjamini 
and Makarychev [7] studied the special case of the problem for $k = 2$ (i.e.,
when there are two choices for bins at each round). While our focus was
mainly the regime $k = \Omega(\log n)$ (where one can readily achieve a constant
maximal load when there is unlimited memory), our results also apply for
smaller values of $k$. Namely, as a by-product we improve the lower bound of
[7] by a factor of 2, as well as extend it from $k = 2$ to any $k \le \mathrm{polylog}(n)$.

2. A large deviation inequality

This section contains a large deviation result, which will later be one of the 
key ingredients in proving our lower bounds for the load. Our proof will rely 
on a Bernstein-Kolmogorov type inequality of Freedman [T3], which extends 
the standard Azuma-Hoeffding martingale concentration inequality. Given a
sequence of bounded (possibly dependent) random variables $(X_i)$ adapted to
some filter $(\mathcal{F}_i)$, one can consider the sequence $(Y_i)$ where
$Y_i = \mathbb{E}[X_i \mid \mathcal{F}_{i-1}]$, which can be viewed as predictions for
the $(X_i)$-s. The following proposition essentially says that, if the sum of
the predictions is large, so is the sum of the actual variables $X_i$.

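Proposition 2.1. Let $(X_i)$ be a sequence of random variables adapted to
the filter $(\mathcal{F}_i)$ so that $0 \le X_i \le M$ for all $i$, and let
$Y_i = \mathbb{E}[X_i \mid \mathcal{F}_{i-1}]$. Then

\[ \mathbb{P}\Big( \Big| \frac{\sum_{i \le t} X_i}{\sum_{i \le t} Y_i} - 1 \Big| \ge \frac12 \ \text{ and } \ \sum_{i \le t} Y_i \ge h \ \text{ for some } t \Big) \le \exp\Big( -\frac{h}{20M} + 2 \Big) . \]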

Proof. As mentioned above, the proof hinges on a tail-inequality for sums 
of random variables, which appears in the work of Freedman [T3] from 1975 
(see also [2U] ) , and extends such inequalities of Bernstein and Kolmogorov to 
the setting of martingales. See [T3| and the references therein for more back- 
ground on these inequalities, as well as [lOj for similar martingale estimates. 
We include the short proof of Theorem 2.2 for completeness.

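Theorem 2.2 ([14, Theorem 1.6]). Let $(S_0, S_1, \ldots)$ be a martingale with
respect to the filter $(\mathcal{F}_i)$. Suppose that $S_{i+1} - S_i \le M$ for all $i$,
and write $V_t = \sum_{i=1}^t \operatorname{Var}(S_i \mid \mathcal{F}_{i-1})$. Then for
any $s, v > 0$ we have

\[ \mathbb{P}\big( S_n \ge S_0 + s , \ V_n \le v \ \text{ for some } n \big) \le \exp\Big( -\frac{s^2}{2(v + Ms)} \Big) . \]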

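Proof. Without loss of generality, suppose $S_0 = 0$, and put $X_i = S_i - S_{i-1}$.
Re-scaling $S_n$ by $M$, it clearly suffices to treat the case $X_i \le 1$. Set

\[ V_t = \sum_{i=1}^t \operatorname{Var}(S_i \mid \mathcal{F}_{i-1}) = \sum_{i=1}^t \mathbb{E}\big[ X_i^2 \mid \mathcal{F}_{i-1} \big] , \]

and for some $\lambda > 0$ to be specified later, define

\[ Z_t = \exp\big( \lambda S_t - (e^\lambda - 1 - \lambda) V_t \big) . \]

The next calculation will show that $(Z_t)$ is a super-martingale with respect
to the filter $(\mathcal{F}_t)$. First notice that the function

\[ f(z) = \frac{e^z - 1 - z}{z^2} \ \text{ for } z \ne 0 , \qquad f(0) = \frac12 \]

is monotone increasing (as $f'(z) > 0$ for all $z \ne 0$), and in particular,
$f(\lambda z) \le f(\lambda)$ for all $z \le 1$. Rearranging,

\[ \exp(\lambda z) \le 1 + \lambda z + \big( e^\lambda - 1 - \lambda \big) z^2 \quad \text{for all } z \le 1 . \]

Now, since $X_i \le 1$ and $\mathbb{E}[X_i \mid \mathcal{F}_{i-1}] = 0$ for all $i$, it
follows that

\[ \mathbb{E}\big[ \exp(\lambda X_i) \mid \mathcal{F}_{i-1} \big] \le 1 + \big( e^\lambda - 1 - \lambda \big) \mathbb{E}\big[ X_i^2 \mid \mathcal{F}_{i-1} \big] \le \exp\Big( \big( e^\lambda - 1 - \lambda \big) \mathbb{E}\big[ X_i^2 \mid \mathcal{F}_{i-1} \big] \Big) . \]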




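By definition, this precisely says that $\mathbb{E}[Z_i \mid \mathcal{F}_{i-1}] \le Z_{i-1}$.
That is, $(Z_t)$ is a super-martingale, and hence by the Optional Stopping
Theorem so is $(Z_{\tau \wedge n})$, where $n$ is some integer and
$\tau = \min\{t : S_t \ge s\}$. In particular,

\[ \mathbb{E} Z_{\tau \wedge n} \le Z_0 = 1 , \]

and (noticing that $V_{t+1} \ge V_t$ for all $t$) Markov's inequality next implies that

\[ \mathbb{P}\Big( \bigcup_{t \le n} \{ S_t \ge s , \ V_t \le v \} \Big) \le \exp\big( -\lambda s + (e^\lambda - 1 - \lambda) v \big) . \]

A choice of $\lambda = \log\big(\frac{s+v}{v}\big) \ge \frac{s}{s+v} + \frac12 \big(\frac{s}{s+v}\big)^2$ therefore yields

\[ \mathbb{P}\Big( \bigcup_{t \le n} \{ S_t \ge s , \ V_t \le v \} \Big) \le \exp\Big[ s - (s+v) \log\Big( \frac{s+v}{v} \Big) \Big] \le \exp\Big( -\frac{s^2}{2(s+v)} \Big) , \]

and taking a limit over $n$ concludes the proof. ■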
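
The elementary inequalities used in the proof above can be checked numerically. The sketch below is only a sanity check on sample points, not a proof; the function names are hypothetical, and the tolerances are arbitrary choices that absorb floating-point rounding.

```python
import math


def f(z):
    """f(z) = (e^z - 1 - z) / z^2, extended by f(0) = 1/2."""
    if abs(z) < 1e-6:
        return 0.5 + z / 6.0  # Taylor expansion avoids cancellation near 0
    return (math.expm1(z) - z) / (z * z)


def exp_bound_holds(lam, z, tol=1e-12):
    """Check exp(lam*z) <= 1 + lam*z + (e^lam - 1 - lam) z^2 (claimed for z <= 1)."""
    lhs = math.exp(lam * z)
    rhs = 1.0 + lam * z + (math.expm1(lam) - lam) * z * z
    return lhs <= rhs * (1.0 + tol) + tol


def freedman_exponents(s, v):
    """Exponent at lam = log((s+v)/v), its closed form, and the final bound."""
    lam = math.log((s + v) / v)
    optimized = -lam * s + (math.expm1(lam) - lam) * v
    closed_form = s - (s + v) * math.log((s + v) / v)
    final = -s * s / (2.0 * (s + v))
    return optimized, closed_form, final


if __name__ == "__main__":
    print(all(exp_bound_holds(1.0, z / 10) for z in range(-50, 11)))
    print(freedman_exponents(1.0, 1.0))
```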



Remark. Note that Theorem 2.2 generalizes the well-known version of
the Azuma-Hoeffding inequality, where each of the terms $\operatorname{Var}(X_i \mid \mathcal{F}_{i-1})$
is bounded by some constant af (cf., e.g., p[8]). 

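We now wish to infer Proposition 2.1 from Theorem 2.2. To this end,
define

\[ Z_t = \sum_{i=1}^t (Y_i - X_i) , \qquad V_t = \sum_{i=1}^t \operatorname{Var}(Z_i \mid \mathcal{F}_{i-1}) , \]

and observe that $(Z_t)$ is a martingale by the definition
$Y_i = \mathbb{E}[X_i \mid \mathcal{F}_{i-1}]$. Moreover, as the $X_i$-s are uniformly
bounded, so are the increments of $Z_t$:

\[ |Z_i - Z_{i-1}| = |Y_i - X_i| \le M . \]

Furthermore, crucially, the variances of the increments are bounded as well
in terms of the conditional expectations:

\[ \operatorname{Var}(Y_i - X_i \mid \mathcal{F}_{i-1}) = \operatorname{Var}(X_i \mid \mathcal{F}_{i-1}) \le M \cdot \mathbb{E}[X_i \mid \mathcal{F}_{i-1}] = M \cdot Y_i , \]

giving that $V_t \le M \sum_{i=1}^t Y_i$.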

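Finally, for any integer $j \ge 1$ let $A_j$ denote the event

\[ A_j = \Big\{ \sum_{i \le t} X_i \le \frac12 \sum_{i \le t} Y_i \ \text{ and } \ jh \le \sum_{i \le t} Y_i \le (j+1)h \ \text{ for some } t \Big\} . \]

Note that the event $A_j$ implies that $Z_t \ge jh/2$. Hence, applying Theorem 2.2
to the martingale $(Z_t)$ along with its cumulative variances $(V_t)$ we now get

\[ \mathbb{P}(A_j) \le \mathbb{P}\Big( Z_t \ge \frac{j}{2} h , \ V_t \le (j+1)hM \ \text{ for some } t \Big) \le \exp\Big( -\frac{(\frac{j}{2} h)^2}{2\big( (j+1)hM + M(\frac{j}{2} h) \big)} \Big) = \exp\Big( -\frac{j^2}{4(3j+2)} \cdot \frac{h}{M} \Big) \le \exp\Big( -\frac{jh}{20M} \Big) . \]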
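Summing over the values of $j$, we obtain that if $h \ge 20M$ then

\[ \mathbb{P}\Big( \bigcup_{j \ge 1} A_j \Big) \le \frac{1}{1 - e^{-1}} \exp\Big( -\frac{h}{20M} \Big) \le \exp\Big( -\frac{h}{20M} + 1 \Big) , \]

while for $h < 20M$ the above inequality holds trivially. Hence, for all $h > 0$,

\[ \mathbb{P}\Big( \exists t : \Big\{ \sum_{i \le t} X_i \le \frac12 \sum_{i \le t} Y_i \ \text{ and } \ \sum_{i \le t} Y_i \ge h \Big\} \Big) \le \exp\Big( -\frac{h}{20M} + 1 \Big) . \tag{2.1} \]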

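To complete the proof of the proposition, we repeat the above analysis for

\[ Z'_t = -Z_t = \sum_{i=1}^t (X_i - Y_i) , \qquad V'_t = \sum_{i=1}^t \operatorname{Var}(Z'_i \mid \mathcal{F}_{i-1}) = V_t . \]

Clearly, we again have $|Z'_i - Z'_{i-1}| \le M$ and $V'_t \le M \sum_{i=1}^t Y_i$. Defining

\[ A'_j = \Big\{ \sum_{i \le t} Y_i \le \frac23 \sum_{i \le t} X_i \ \text{ and } \ jh \le \sum_{i \le t} Y_i \le (j+1)h \ \text{ for some } t \Big\} , \]

it follows that the event $A'_j$ implies that $Z'_t \ge \frac12 \sum_{i \le t} Y_i \ge jh/2$.
Therefore, as before, we have that

\[ \mathbb{P}(A'_j) \le \mathbb{P}\Big( Z'_t \ge \frac{j}{2} h , \ V'_t \le (j+1)hM \ \text{ for some } t \Big) \le \exp\Big( -\frac{jh}{20M} \Big) , \]

and thus for all $h > 0$,

\[ \mathbb{P}\Big( \exists t : \Big\{ \sum_{i \le t} Y_i \le \frac23 \sum_{i \le t} X_i \ \text{ and } \ \sum_{i \le t} Y_i \ge h \Big\} \Big) \le \exp\Big( -\frac{h}{20M} + 1 \Big) . \tag{2.2} \]

Summing the probabilities in (2.1) and (2.2) yields the desired result. ■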
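
The numerical constants in the proof of Proposition 2.1 can be sanity-checked as follows. This is a sketch with hypothetical function names, under the reading that the per-block exponent is $j^2 h / (4(3j+2)M)$ and that the tail sum is taken with $x = h/(20M) \ge 1$.

```python
import math


def per_block_exponent_ok(j):
    """j^2 / (4(3j+2)) >= j/20, the step that gives exp(-jh/(20M))."""
    return 20 * j * j >= 4 * (3 * j + 2) * j


def geometric_tail(x, terms=10_000):
    """sum_{j>=1} exp(-j*x), truncated after `terms` terms."""
    return sum(math.exp(-j * x) for j in range(1, terms + 1))


if __name__ == "__main__":
    print(all(per_block_exponent_ok(j) for j in range(1, 1000)))
    for x in (1.0, 2.0, 5.0):
        # Tail sum versus exp(-x + 1), and twice that versus exp(-x + 2).
        print(x, geometric_tail(x), math.exp(1 - x), 2 * math.exp(1 - x) <= math.exp(2 - x))
```

The last comparison reflects that adding the two one-sided bounds costs a factor of 2, which is absorbed by changing the additive constant in the exponent from 1 to 2.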

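We note that essentially the same proof yields the following generalization
of Proposition 2.1. As before, the constants can be optimized.

Proposition 2.3. Let $(X_i)$ and $(Y_i)$ be as given in Proposition 2.1. Then
for any $0 < \varepsilon \le \frac12$,

\[ \mathbb{P}\Big( \Big| \frac{\sum_{i \le t} X_i}{\sum_{i \le t} Y_i} - 1 \Big| \ge \varepsilon \ \text{ and } \ \sum_{i \le t} Y_i \ge h \ \text{ for some } t \Big) \le \exp\Big( -\frac{h \varepsilon^2}{5M} + 2 \Big) . \]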

Remark 2.4. The statements of Proposition 2.1 and Proposition 2.3 hold
also in conjunction with any stopping time $\tau$ adapted to the filter $(\mathcal{F}_i)$.
That is, we get the same bound on the probability of the mentioned event
happening at any time $t \le \tau$. This follows easily, for instance, by altering the
sequence of increments to be identically 0 after $\tau$. Such statements become
useful when the uniform bound on the increments is only valid before $\tau$.

3. Lower bounds on the collisions and load

In this section we prove Theorem 3 as well as the lower bound in
Theorem 1, by showing that if the quantity $km/n$ is suitably small, then any
allocation would necessarily produce nearly linearly many bins with
arbitrarily large load.

The main ingredient in the proof is a bound for the number of collisions,
i.e., pairs of balls that share a bin, defined next. Let $N_t(i)$ denote the number
of balls in bin $i$ after performing $t$ rounds; the number of collisions at time
$t$ is then

\[ \mathrm{Col}_2(t) = \sum_{i=1}^n \binom{N_t(i)}{2} . \]

The following theorem provides a lower bound on $\mathrm{Col}_2(t)$ for $t \ge c \cdot km$ for
some absolute $c > 0$.
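
The collision count can be computed either from the final loads or incrementally, since the ball placed at time $s$ in bin $J_s$ creates $N_{s-1}(J_s)$ new colliding pairs. The sketch below shows both; the function names are hypothetical.

```python
from math import comb


def collisions(loads):
    """Col_2 = sum over bins of C(load, 2), the number of pairs sharing a bin."""
    return sum(comb(load, 2) for load in loads)


def collisions_online(positions, n):
    """Accumulate Col_2 ball by ball: each new ball adds N_{s-1}(J_s) pairs."""
    loads = [0] * n
    total = 0
    for j in positions:
        total += loads[j]  # pairs formed with balls already in bin j
        loads[j] += 1
    return total, loads


if __name__ == "__main__":
    total, loads = collisions_online([0, 2, 2, 0, 2], n=3)
    print(total, collisions(loads))
```

Both computations agree on any placement sequence, and a single bin of load L alone contributes C(L, 2) collisions.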

Theorem 3.1. Consider $n$ balls and $n$ bins, where each ball has $k$ uniform
choices for bins, and $m \ge \log n$ bits of memory are available.
(i) For all $t \ge 500 \cdot km$ we have

\[ \mathbb{E}\, \mathrm{Col}_2(t) \ge t^2/(9n) . \]

(ii) Furthermore, with probability $1 - O(n^{-4})$, for all $L = L(n)$ and any
$t \ge \big( 500km \vee 30\sqrt{Ln \log n} \big)$, either the maximal load is at least $L$ or

\[ \mathrm{Col}_2(t) \ge t^2/(16n) . \]

Note that the main statement of Theorem 3 immediately follows from the
above theorem, by choosing $t = (1 - \delta)n$ and $L = \sqrt{n}$. Indeed, recalling
the assumption in Theorem 3 that $m \ge \log n$, we obtain that, except with
probability $O(n^{-4})$, either the algorithm creates a load of $\sqrt{n}$, or it has
$\mathrm{Col}_2(t) \ge t^2/(16n) = \Omega(n)$. Observing that a load of $L$ immediately
induces $\binom{L}{2}$ collisions, we deduce that either way there are at least
$\Omega(n)$ collisions whp.

We next prove Theorem 3.1; the statement of Theorem 3 on unbounded
maximal load will follow from an iterative application of a more general form
of this theorem (namely, Theorem 3.4), which appears in Subsection 3.1.

Proof of Theorem 3.1. As noted in the Introduction, we relax the model
by allowing the algorithm to choose any distribution $\mu = (\mu(1), \ldots, \mu(n))$
for the location of the next ball, as long as it satisfies $\|\mu\|_\infty \le k/n$.

We also relax the memory constraint as follows. The algorithm has a
pool of at most $2^m$ different strategies, and may choose any of them at a
given step without any restriction (basing its dynamic decision on the entire
history).

To summarize, the algorithm has a pool of at most $2^m$ strategies, all of
which have an $L^\infty$-norm of at most $k/n$. In each given round, it adaptively
chooses a strategy $\mu$ from this pool based on the entire history, and a ball
then falls to a bin distributed according to $\mu$.

The outline of the proof is as follows: consider the sequence $Q_1, \ldots, Q_n$,
chosen adaptively out of the pool of $2^m$ strategies. The large deviation
inequality of Section 2 (Proposition 2.1) will enable us to show the following:
The expected number of collisions encountered in the above process is well
approximated by the expected number of collisions between $n$ independent
balls, placed according to $Q_1, \ldots, Q_n$ (i.e., equivalent to the result of the
non-adaptive algorithm with strategies $Q_1, \ldots, Q_n$).

Having reduced the problem to the analysis of a non-adaptive algorithm,
we may then derive a lower bound on $\mathbb{E}\,\mathrm{Col}_2(t)$ by analyzing the structure
of the above strategies. This bound is then translated to a bound on $\mathrm{Col}_2(t)$
using another application of the large deviation inequality of Proposition 2.1.

Let $\nu = (\nu(1), \ldots, \nu(n))$ be an arbitrary probability distribution on $[n]$
satisfying $\|\nu\|_\infty \le k/n$, and denote by $Q_s = (Q_s(1), \ldots, Q_s(n))$ the
strategy of the algorithm at time $s$. It will be convenient from time to time
to treat these distributions as vectors in $\mathbb{R}^n$.

By the above discussion, $Q_s$ is a random variable whose values belong to
some a-priori set $\{\mu_1, \ldots, \mu_{2^m}\}$. We further let $J_s$ denote the actual
position of the ball at time $s$ (drawn according to the distribution $Q_s$).

Given the strategy at time $s$, let $x_s^\nu$ denote the probability of a collision
between $\nu$ and $Q_s$ given $J_s$, i.e., that the ball that is distributed according
to $\nu$ will collide with the one that arrived in time $s$. We let $v_s^\nu$ be the inner
product of $Q_s$ and $\nu$, which measures the expectation of these collisions:

\[ x_s^\nu = \nu(J_s) , \qquad v_s^\nu = \langle Q_s , \nu \rangle = \sum_{i=1}^n Q_s(i) \nu(i) . \]

Further define the cumulative sums of $x_s^\nu$ and $v_s^\nu$ as follows:

\[ X_t^\nu = \sum_{s=1}^t x_s^\nu , \qquad V_t^\nu = \sum_{s=1}^t v_s^\nu . \]

To motivate these definitions, notice that given the history up to time $s - 1$
and any possible strategy $\nu$ for the next round, we have

\[ X_{s-1}^\nu = \sum_{r=1}^{s-1} \nu(J_r) = \sum_{i=1}^n \nu(i) \, \big| \{ r < s : J_r = i \} \big| = \sum_{i=1}^n \nu(i) N_{s-1}(i) , \]

and so $X_{s-1}^{Q_s}$ is the expected number of collisions that will be contributed
by the ball $J_s \sim Q_s$ given the entire history $\mathcal{F}_{s-1}$. Summing over $s$, we
have that

\[ \mathbb{E}\, \mathrm{Col}_2(t) = \mathbb{E}\Big[ \sum_{s=1}^t X_{s-1}^{Q_s} \Big] , \]

thus estimating the quantities $X_{s-1}^{Q_s}$ will provide a bound on the expected
number of collisions. Our aim in the next lemma is to show that whp,
whenever $V_{s-1}^{Q_s}$ is large, so is $X_{s-1}^{Q_s}$. This will reduce the problem to the
analysis of the quantities $V_{s-1}^{Q_s}$, which are deterministic functions of
$Q_1, \ldots, Q_n$. This is the main conceptual ingredient in the lower bound, and
its proof will follow directly from the large deviation estimate given in
Proposition 2.1.

Lemma 3.2. Let $Q_1, \ldots, Q_n$ be a sequence of strategies adapted to the filter
$(\mathcal{F}_i)$, and let $X_s^\nu$ and $V_s^\nu$ be defined as above. Then with probability at
least $1 - O(e^{-4m})$, for every $\nu \in \{\mu_1, \ldots, \mu_{2^m}\}$ and every $s$ we have that
$V_s^\nu \ge 100 \|\nu\|_\infty m$ implies $X_s^\nu \ge V_s^\nu / 2$.

Proof. Before describing the proof, we wish to emphasize a delicate point.
The lemma holds for any sequence of strategies $Q_1, Q_2, \ldots, Q_n$ (each $Q_i$ is
an arbitrary function of $\mathcal{F}_{i-1}$). No restrictions are made here on the way
each such $Q_i$ is produced (e.g., it does not even need to belong to the pool
of $2^m$ strategies), as long as it satisfies $\|Q_i\|_\infty \le k/n$. The reason that such
a general statement is possible is the following: Once we specify how each
$Q_i$ is determined from $\mathcal{F}_{i-1}$ (this can involve extra random bits, in case the
adaptive algorithm is randomized), the process of exposing the positions of
the balls, $J_i \sim Q_i$, defines a martingale. Hence, for each fixed $\nu$, we would be
able to show that the desired event occurs except with probability $O(e^{-5m})$.
A union bound over the strategies $\nu$ (which, crucially, do belong to the pool
of size $2^m$) will then complete the proof.

Fix a strategy $\nu$ out of the pool of $2^m$ possible strategies, and recall the
definitions of $x_s^\nu$ and $v_s^\nu$, according to which

\[ 0 \le x_s^\nu \le \|\nu\|_\infty , \qquad v_s^\nu = \mathbb{E}\big[ x_s^\nu \mid \mathcal{F}_{s-1} \big] . \]

By applying Proposition 2.1 to the sequence $(x_s^\nu)$ (with the cumulative sums
$X_s^\nu$ and cumulative conditional expectations $V_s^\nu$), we obtain that for all $h$,

\[ \mathbb{P}\big( X_s^\nu \le V_s^\nu / 2 , \ V_s^\nu \ge h \ \text{ for some } s \big) \le \exp\Big( -\frac{h}{20 \|\nu\|_\infty} + 2 \Big) . \]

Thus, taking $h = 100 \|\nu\|_\infty m$,

\[ \mathbb{P}\big( X_s^\nu \le V_s^\nu / 2 , \ V_s^\nu \ge 100 \|\nu\|_\infty m \ \text{ for some } s \big) \le \exp( -5m + 2 ) . \]

Summing over the pool of at most $2^m$ predetermined strategies $\nu$ completes
the proof. ■

Having shown that $X_s^\nu$ is well approximated by $V_s^\nu$, and recalling that we
are interested in estimating $X_{s-1}^{Q_s}$, we now turn our attention to the possible
values of $V_{s-1}^{Q_s}$.



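Claim 3.3. For any sequence of strategies $Q_1, \ldots, Q_t$ we have that

\[ \sum_{s=1}^t V_{s-1}^{Q_s} \ge \frac{t(t-k)}{2n} . \]

Proof. By our definitions, for the strategies $Q_1, \ldots, Q_t$ we have

\[ \sum_{s=1}^t V_{s-1}^{Q_s} = \sum_{s=1}^t \sum_{r=1}^{s-1} \langle Q_r , Q_s \rangle = \sum_{i=1}^n \sum_{r < s \le t} Q_r(i) Q_s(i) = \frac12 \sum_{i=1}^n \Big[ \Big( \sum_{s=1}^t Q_s(i) \Big)^2 - \sum_{s=1}^t Q_s(i)^2 \Big] . \tag{3.1} \]

Recalling the definition of the strategies $Q_i$, we have that

\[ 0 \le Q_s(i) \le k/n \ \text{ for all } i \text{ and } s , \qquad \sum_{i=1}^n Q_s(i) = 1 \ \text{ for all } s . \]

Therefore,

\[ \sum_{i=1}^n \sum_{s=1}^t Q_s(i)^2 \le \frac{k}{n} \sum_{i=1}^n \sum_{s=1}^t Q_s(i) = \frac{kt}{n} . \]

On the other hand, by Cauchy-Schwarz,

\[ \sum_{i=1}^n \Big( \sum_{s=1}^t Q_s(i) \Big)^2 \ge \frac{1}{n} \Big( \sum_{i=1}^n \sum_{s=1}^t Q_s(i) \Big)^2 = \frac{t^2}{n} . \]

Plugging these two estimates in (3.1) we deduce that

\[ \sum_{s=1}^t V_{s-1}^{Q_s} \ge \frac{t(t-k)}{2n} , \]

as required. ■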
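
The bound of Claim 3.3 can be checked numerically on a concrete family of strategies. The sketch below assumes, for illustration only, that every strategy is uniform on a random set of $n/k$ bins, so that its maximum probability is exactly $k/n$; the function names are hypothetical.

```python
import random


def uniform_on_subset(n, size, rng):
    """A strategy: the uniform distribution on a random subset of `size` bins."""
    q = [0.0] * n
    for i in rng.sample(range(n), size):
        q[i] = 1.0 / size
    return q


def pairwise_overlap(strategies):
    """sum_{r<s} <Q_r, Q_s>, i.e. the sum over s of V_{s-1}^{Q_s}."""
    n = len(strategies[0])
    prefix = [0.0] * n  # prefix[i] = sum of Q_r(i) over the strategies seen so far
    total = 0.0
    for q in strategies:
        total += sum(p * x for p, x in zip(prefix, q))
        prefix = [p + x for p, x in zip(prefix, q)]
    return total


if __name__ == "__main__":
    n, k, t = 60, 6, 40
    rng = random.Random(0)
    strategies = [uniform_on_subset(n, n // k, rng) for _ in range(t)]
    print(pairwise_overlap(strategies), t * (t - k) / (2 * n))
```

The first printed value should never fall below the second, whatever subsets are drawn, since the claim holds for every sequence of admissible strategies.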



While the above claim tells us that the average size of $V_{s-1}^{Q_s}$ is fairly large
(has order at least $(t - k)/n$), we wish to obtain bounds corresponding
to individual distributions $Q_s$. As we next show, this sum indeed enjoys
a significant contribution from indices $s$ where $V_{s-1}^{Q_s} = \Omega(km/n)$. More
precisely, setting $h = 100km/n$, we claim that for large enough $n$,

\[ \sum_{s=1}^t V_{s-1}^{Q_s} \mathbf{1}_{\{ V_{s-1}^{Q_s} > h \}} \ge \frac{t^2}{4n} . \tag{3.2} \]

To see this, observe that if

\[ t \ge t_0 = 5hn = 500km , \]

then

\[ \sum_{s=1}^t V_{s-1}^{Q_s} \mathbf{1}_{\{ V_{s-1}^{Q_s} \le h \}} \le ht \le \frac{t^2}{5n} . \]

Combining this with Claim 3.3 (while noting that
$\frac{t(t-k)}{2n} = (1 - o(1)) \frac{t^2}{2n}$) yields (3.2) for any sufficiently large $n$.

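We may now apply Lemma 3.2, and obtain that, except with probability
$O(e^{-4m})$, whenever $V_{s-1}^{Q_s} \ge h$ we have $X_{s-1}^{Q_s} \ge \frac12 V_{s-1}^{Q_s}$, and so

\[ \sum_{s=1}^t X_{s-1}^{Q_s} \ge \frac12 \sum_{s=1}^t V_{s-1}^{Q_s} \mathbf{1}_{\{ V_{s-1}^{Q_s} > h \}} \ge \frac{t^2}{8n} . \tag{3.3} \]

Altogether, since $X_{s-1}^{Q_s} \ge 0$, we infer that

\[ \mathbb{E}\, \mathrm{Col}_2(t) = \mathbb{E}\Big[ \sum_{s=1}^t X_{s-1}^{Q_s} \Big] \ge \frac{t^2}{8n} \big( 1 - O(n^{-4}) \big) \ge \frac{t^2}{9n} \quad \text{for all } t \ge t_0 , \tag{3.4} \]

where the last inequality holds for large enough $n$. This proves Part (i) of
Theorem 3.1.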

It remains to establish concentration for $\mathrm{Col}_2(t)$ under the additional
assumption that $t \ge 30\sqrt{Ln \log n}$ for some $L = L(n)$. First, set the following
stopping-time for reaching a maximal-load of $L$:

\[ \tau_L = \min\big\{ t : \max_j N_t(j) \ge L \big\} . \]

Next, recall that

\[ \mathrm{Col}_2(t) = \sum_{s=1}^t N_{s-1}(J_s) , \]

and notice that

\[ \mathbb{E}\big[ N_{s-1}(J_s) \mid \mathcal{F}_{s-1} \big] = \sum_{i=1}^n Q_s(i) N_{s-1}(i) = X_{s-1}^{Q_s} . \]

Therefore, we may apply our large deviation estimate given in Section 2
(Proposition 2.1), combined with the stopping-time $\tau_L$ (see Remark 2.4):

• The sequence of increments is $(N_{s-1}(J_s))$.

• The sequence of conditional expectations is $(X_{s-1}^{Q_s})$.

• The bound on the increments is $L$, as $N_{s-1}(J_s) \le \max_j N_{s-1}(j) \le L$
for all $s \le \tau_L$.
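It follows that

\[ \mathbb{P}\Big( \Big\{ \mathrm{Col}_2(t') \le \frac12 \sum_{s \le t'} X_{s-1}^{Q_s} \ \text{ and } \ \sum_{s \le t'} X_{s-1}^{Q_s} \ge \frac{t^2}{8n} \Big\} \ \text{ for some } t' \le \tau_L \Big) \le \exp\Big( -\frac{t^2}{160 L n} + 2 \Big) \le O(n^{-5}) , \]

where the last inequality is by the assumption $t \ge 30\sqrt{Ln \log n}$. Finally,
by (3.3), we also have that $\sum_{s \le t} X_{s-1}^{Q_s} \ge t^2/(8n)$ for all $t \ge t_0$, except with
probability $O(n^{-4})$. Combining these two statements, we deduce that for
any $t \ge \big( t_0 \vee 30\sqrt{Ln \log n} \big)$,

\[ \mathbb{P}\Big( \mathrm{Col}_2(t) < \frac{t^2}{16n} , \ \tau_L > t \Big) = O(n^{-4}) , \]

concluding the proof of Theorem 3.1. ■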

3.1. Boosting the subcritical regime to unbounded maximal load. 

While Theorem 3.1 given above provides a careful analysis for the number
of 2-collisions, i.e., pairs of balls sharing a bin, one can iteratively apply this
theorem, with very few modifications, in order to obtain that the number of
$q$-collisions (a set of $q$ balls sharing a bin) has order $\Omega(n^{1-o(1)})$ whp. The
proof of this result hinges on Theorem 3.4 below, which is a generalization
of Theorem 3.1.

Recall that in the relaxed model studied so far, at any given time $t$ the algorithm adaptively selects a strategy $Q_t$ (based on the entire history $\mathcal{F}_{t-1}$), after which a ball is positioned in a bin $J_t \sim Q_t$. We now introduce an extra set of random variables, in the form of a sequence of increasing subsets, $A_1 \subset \dots \subset A_n \subset [n]$. The set $A_t$ is determined by $\mathcal{F}_{t-1}$, and has the following effect: If $J_t \in A_t$, we add a ball to this bin as usual, whereas if $J_t \notin A_t$, we ignore this ball (all bins remain unchanged). That is, the number of balls in bin $i$ at time $t$ is now given by

\[ N_t(i) = \sum_{s=1}^{t} \mathbf{1}_{\{J_s = i\}} \mathbf{1}_{A_s}(i) , \]

and as before we are interested in a lower bound for the number of collisions:

\[ \mathrm{Col}_2(t) = \sum_{i=1}^{n} \binom{N_t(i)}{2} . \]

The idea here is that, in the application, the set $A_t$ will consist of the bins that already contain $\ell$ balls at time $t$. As such, they indeed form an increasing sequence of subsets determined by $(\mathcal{F}_i)$. In this case, any collision corresponds to 2 balls placed in some bin which already has $\ell$ other balls, and thus immediately implies a load of $\ell + 2$.
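The filtered count can be illustrated with a short simulation. This is a sketch under stated assumptions: balls are placed uniformly at random, $A_t$ is taken to be the set of bins whose true load before round $t$ is at least $\ell$, and the name `filtered_counts` is hypothetical.

```python
import random

def filtered_counts(n, rounds, ell, seed=0):
    """Track true loads and the filtered count of balls landing in A_t.

    Assumption: uniform placement; A_t = bins holding at least `ell` balls
    before round t. A ball outside A_t is ignored by the filtered count only.
    """
    rng = random.Random(seed)
    true_load = [0] * n
    counted = [0] * n
    for _ in range(rounds):
        j = rng.randrange(n)
        if true_load[j] >= ell:   # J_t belongs to A_t, so the ball is counted
            counted[j] += 1
        true_load[j] += 1         # the real process always places the ball
    return true_load, counted
```

In this sketch the filtered count of a bin equals its true load minus $\ell$ (or zero), so two counted balls in one bin certify a true load of at least $\ell + 2$.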
Theorem 3.4. Consider the following balls and bins setting:

(1) The online adaptive algorithm has a pool of $2^m$ possible strategies, where each strategy $\mu$ satisfies $\|\mu\|_\infty \le k/n$. The algorithm selects a (random) sequence of strategies $Q_1,\dots,Q_n$ adapted to the filter $(\mathcal{F}_i)$.

(2) Let $A_1 \subset \dots \subset A_n \subset [n]$ denote a random increasing sequence of subsets adapted to the filter $(\mathcal{F}_i)$, i.e., $A_i$ is determined by $\mathcal{F}_{i-1}$.

(3) There are $n$ rounds, where in round $t$ a new potential location for a ball is chosen according to $Q_t$. If this location belongs to $A_t$, a ball is positioned there (otherwise, nothing happens).

Define $T = \sum_{s=1}^{n} Q_s(A_s)$. Then for any $L = L(n)$,

\[ \mathbb{P}\Big( T \ge 30\big(\sqrt{kmn} \vee \sqrt{Ln\log n}\big) \,,\ \mathrm{Col}_2(n) < \frac{T^2}{16n} \,,\ \max_j N_n(j) < L \Big) \le O(n^{-4}) . \]

Proof. As the proof follows the same arguments of Theorem 3.1, we restrict our attention to describing the modifications that are required for the new statement to hold.

Define the following sub-distribution of $Q_s$ with respect to $A_s$:

\[ Q'_s = Q_s \mathbf{1}_{A_s} . \]

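As before, given $Q_s$, the strategy at time $s$, define the following parameters:

\[ x_s^{\nu} = \nu(J_s)\mathbf{1}_{A_s}(J_s) , \qquad v_s^{\nu} = \mathbb{E}[x_s^{\nu} \mid \mathcal{F}_{s-1}] = \sum_{i=1}^{n} \nu(i) Q'_s(i) , \]

and let the cumulative sums of $v_s^{\nu}$ and $x_s^{\nu}$ be denoted by:

\[ V_t^{\nu} = \sum_{s=1}^{t} v_s^{\nu} , \qquad X_t^{\nu} = \sum_{s=1}^{t} x_s^{\nu} . \]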

We claim that a statement analogous to that of Lemma 3.2 holds as is with respect to the above definitions, for any choice of increasing subsets $A_1 \subset \dots \subset A_n$ (adapted to the filter $(\mathcal{F}_i)$). As we soon argue, the martingale concentration argument is valid without any changes, and the only delicate point is the identity of the target strategy $\nu$.

Lemma 3.5. Let $Q_1,\dots,Q_n$ and $A_1 \subset \dots \subset A_n$ be strategies and subsets resp., adapted to the filter $(\mathcal{F}_i)$, and let $V_s^{\nu}$ and $X_s^{\nu}$ be defined as above. Then with
probability at least 1 — 0(e~^™), for every v G {^ui . . . , /i2'"} 0''n.d every s we 
have that Vg > 100||z^||oo?7i implies Xg > Vg /2, where v' = i^Ias+i- 

Proof. Let $\nu$ be a strategy. Previously (in the proof of Lemma 3.2), we compared $X_s^{\nu}$ to $V_s^{\nu}$ using the large deviation inequality of Section 2. Now, for each $s$, our designated $\nu'$ is a function of $\nu$ and $A_{s+1}$, and hence depends on $\mathcal{F}_s$. In particular, there are potentially more than $2^m$ different strategies to consider as $\nu'$, destroying our union bound! The crucial observation that resolves this issue is the following:

Observation 3.6. Let $r \ge s$ and let $\nu$ be a strategy. Then $V_s^{\nu} = V_s^{\nu'}$ and $X_s^{\nu} = X_s^{\nu'}$ for any increasing sequence $A_1,\dots,A_r$, where $\nu' = \nu\mathbf{1}_{A_r}$.

To see this, first consider $X_s^{\nu}$ and $X_s^{\nu'}$. If $x_i^{\nu'}$ for some $1 \le i \le s$ had a non-zero contribution to $X_s^{\nu'}$, then by definition $J_i \in A_i$. Since $A_i \subset A_r$, we also have $J_i \in A_r$, and so $x_i^{\nu'} = \nu(J_i)\mathbf{1}_{A_r}(J_i) = x_i^{\nu}$. The statement $V_s^{\nu} = V_s^{\nu'}$ now follows from the fact that $V_s^{\nu'}$ is the sum of $v_i^{\nu'} = \mathbb{E}[x_i^{\nu'} \mid \mathcal{F}_{i-1}]$.

Using the above observation, it now suffices to prove the statement of Lemma 3.5 directly on the strategies $\nu$ (rather than on $\nu'$). Hence, the only difference between this setting and that of Lemma 3.2 is that here some of the rounds are forfeited (as reflected in the new definition of the $V_s^{\nu}$'s). The proof of Lemma 3.2 therefore holds unchanged for this case. ■

Similarly, the following claim is the analogue of Claim 3.3, with $t$ (the number of balls in the original version) replaced by $T = \sum_{s} Q'_s([n])$ (the expected number of balls actually positioned).

Claim 3.7. For any Qi, . . . , Qn and C . . . C An we have that 

^ '-^ - 2n 

s=l 

The proof of the above claim follows from the exact same argument as in Claim 3.3. Notice that the bound there, given as a function of $t$, was actually a bound in terms of $\sum_{s}\sum_{i} Q_s(i)$, and so replacing $Q_s$ by $Q'_s$ yields the desired bound as a function of $T$.

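With this in mind, set $h = 100km/n$ and note that, clearly,

\[ \sum_{s=1}^{n} V_{s-1}^{Q_s} \mathbf{1}_{\{V_{s-1}^{Q_s} \le h\}} \le hn . \]

Therefore, if

\[ t_0 = n\sqrt{5h} \le 25\sqrt{kmn} , \]

then

\[ hn \le \frac{T^2}{5n} \quad\text{for any } T \ge t_0 , \]

and so for such $T$ and any large enough $n$

\[ \sum_{s=1}^{n} V_{s-1}^{Q_s} \mathbf{1}_{\{V_{s-1}^{Q_s} > h\}} \ge \frac{T^2}{4n} . \]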
By following the next arguments from the proof of Theorem 3.1, it now follows that, as long as $T \ge t_0$,

\[ \mathbb{E}\,\mathrm{Col}_2(n) = \mathbb{E}\Big[\sum_{s=1}^{n} X_{s-1}^{Q_s}\Big] \ge \frac{T^2}{8n}\big(1 - O(n^{-4})\big) \ge \frac{T^2}{9n} . \]

Similarly, using the argument as in the proof of Theorem 3.1, which defines the stopping-time $\tau_L$ and applies Proposition 2.1 on the sequence of increments given by

\[ \mathrm{Col}_2(s) - \mathrm{Col}_2(s-1) = N_{s-1}(J_s)\mathbf{1}_{A_s}(J_s) , \]

we deduce that, if $T \ge (t_0 \vee 30\sqrt{Ln\log n})$ then

\[ \mathbb{P}\Big( \mathrm{Col}_2(n) < \frac{T^2}{16n} \,,\, \tau_L \ge n \Big) = O(n^{-4}) , \]

as required. ■

We next show how to infer the results regarding an unbounded maximal load from Theorem 3.4. For each integer $\ell = 0,1,2,\dots$, we define the increasing sequence $(A_t^\ell)$ by:

\[ A_t^\ell = \{ i \in [n] : N_t(i) \ge \ell \} . \]

Further define

\[ T_\ell = \sum_{s=1}^{n} Q_s(A_s^\ell) , \]

which is the expected number of balls that are placed in bins which already hold at least $\ell$ balls. The proof will follow from an inductive argument, which bounds the value of $T_{\ell+1}$ in terms of $T_\ell$.

For some $L = L(n)$ to be specified later, our bounds will be meaningful as long as the maximal load is at most $L$, and

\[ T_\ell \ge 30\big(\sqrt{kmn} \vee \sqrt{Ln\log n}\big) . \tag{3.5} \]

Using Theorem 3.4 we will show that, if (3.5) holds then

\[ T_{\ell+1} \ge \frac{T_\ell^2}{20nL} . \tag{3.6} \]

To this end, define 

that is, Ri denotes the number of collisions between all pairs of balls that 
were placed in a bin, that already held at least i balls. 
To infer (3.6), apply Theorem 3.4 with respect to the subsets $(A_t^\ell)$. The assumption (3.5) implies that, except with probability $O(n^{-4})$, either the load is at least $L$, or

\[ R_\ell \ge \frac{T_\ell^2}{16n} . \]

Notice that any ball that is placed in a bin which contains at most $L$ balls can contribute at most $L$ collisions to the count of $R_\ell$. Therefore, if the maximal load is less than $L$, the following holds: The number of balls placed in bins that already contain at least $\ell + 1$ balls is at least

\[ R_\ell / L \ge \frac{T_\ell^2}{16nL} , \quad\text{with probability } 1 - O(n^{-4}) . \tag{3.7} \]

Recalling that $T_{\ell+1}$ is the expected number of such balls, we infer that

\[ T_{\ell+1} \ge \big(1 - O(n^{-4})\big)\frac{T_\ell^2}{16nL} \ge \frac{T_\ell^2}{20nL} , \]

where the last inequality holds for large enough $n$ (with room to spare).
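To see how quickly such a quadratic recursion decays, one can iterate it exactly. The sketch below assumes the recursion reads $T_{\ell+1} = T_\ell^2/(c\,nL)$ with $T_0 = n$ and $c = 20$ (the constant is read off the surrounding derivation and should be treated as an assumption); the helper names are hypothetical. Under that assumption the iterates match the closed form $n/(cL)^{2^\ell - 1}$.

```python
from fractions import Fraction

def iterate_recursion(n, L, steps, c=20):
    """Iterate T_{l+1} = T_l^2 / (c n L) exactly, starting from T_0 = n.

    Assumption: c = 20 is taken from the surrounding text, not verified here.
    """
    T = [Fraction(n)]
    for _ in range(steps):
        T.append(T[-1] ** 2 / (c * n * L))
    return T

def closed_form(n, L, ell, c=20):
    """Closed form n / (c L)^(2^ell - 1) of the recursion above."""
    return Fraction(n) / Fraction(c * L) ** (2 ** ell - 1)
```

The doubly exponential decay in $\ell$ is what limits the argument to roughly $\log_2\log$ many levels before $T_\ell$ drops below the threshold required in (3.5).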

This establishes that (|3.5|) implies (j3.6p . Since by definition Tg = n, we 
deduce that the decreasing series (Tq,Ti, . . .) satisfies 

Tl 

> (^20^)2^+1-1 satisfies ([33]). 

Rearranging, it follows that, in particular, l\3.5\) is satisfied if 



2<_i _ I n in 



It is now easy to verify that, for any fixed e > 0, choosing 
L = £=(l-e)log,log(£-) 

satisfies (j3.8p for large enough n. By (|3.7p . we can then infer that i?^ > 
with probability 1 — 0{n~'^), hence the maximal load is at least I. This 
concludes the proof of Theorem [3l ■ 



4. Algorithms for perfect matching and constant load 

In this section, we prove Theorem 4 by providing an algorithm that avoids collisions whp using only $O(n/k)$ bits of memory, which is the minimum possible by Theorem 3. The case $km = \Omega(n)$ of Theorem 1 will then follow from repeated applications of this algorithm.
Perfect allocation algorithm for $(1-\delta)n$ balls

1. For $\ell = \lfloor n/\lfloor m/2\rfloor \rfloor$, partition the bins into contiguous blocks $B_1,\dots,B_\ell$ each comprising $\lfloor m/2\rfloor$ bins. Ignore any remaining unused bins.

2. Set d = |'log2 logn)] , and define the arrays Aq, . . . , A^^i: 

• $A_j$ comprises $2^j$ contiguous blocks (a total of $\sim 2^{j-1}m$ bins).

• For each contiguous (non-overlapping) $4^j$-tuple of bins in $A_j$, we keep a single bit that holds whether any of its bins is occupied.

• All blocks currently or previously used are contiguous.

3. Repeat the following procedure until exhausting all rounds:

• Let $j$ be the minimal integer so that a bin of $A_j$, marked as empty, appears in the current selection of $k$ bins. If no such $j$ exists, the algorithm announces failure.

• Allocate the ball into this bin, and mark its $4^j$-tuple as occupied.

• If the fraction of empty $4^j$-tuples remaining in $A_j$ just dropped below $\delta/2$, relocate the array $A_j$ to a fresh contiguous set of $2^j$ empty blocks (immediately beyond the last allocated block). If there are fewer than $2^j$ available new blocks, the algorithm fails.

4. Once $(1-\delta)n$ rounds are performed, the algorithm stops.



We proceed to verify the validity of the algorithm in stages: First, we discuss a more basic version of the algorithm suited for the case where $km = \Omega(n\log n)$; then, we examine an intermediate version which extends the range of the parameters to $km\log m = \Omega(n\log n)$; finally, we study the actual algorithm, which features the tight requirement $km = \Omega(n)$.

Throughout the proof of the algorithm, assume that in each round we are presented with $k$ independent uniform indices of bins, possibly with repetitions. Clearly, an upper bound for the maximal load in this relaxed model translates into one for the original model ($k$ choices without repetitions).



4.1. Basic version of the algorithm. We begin with a description and a proof of a simpler version of the above algorithm, suited for the case where

\[ km \ge (3/\delta)\, n\log n . \tag{4.1} \]

This version will serve as the base for the analysis. For simplicity, assume first that $m \mid n$.
Basic version of allocation algorithm for $(1-\delta)n$ balls

1. Let $B_1,\dots,B_\ell$ be an arbitrary partition of the $n$ bins into $\ell = n/m$ blocks, each containing $m$ bins. Put $r = \lfloor (1-\delta)m \rfloor$.

2. Throughout stage $j \in [\ell]$, only the $m$ bins belonging to $B_j$ are tracked. At the beginning of the stage, all bins in the block are marked empty.

3. Stage $j$ comprises $r$ rounds, in each of which:

• The algorithm attempts to place a ball in an arbitrary empty bin of $B_j$ if possible.

• If no empty bin of $B_j$ is offered, the algorithm declares failure.

4. Once $(1-\delta)n$ rounds are performed, the algorithm stops.
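The four steps of the basic version can be sketched as a short simulation. This is an illustrative sketch, not the authors' code: it assumes $m \mid n$, draws the $k$ choices independently with repetition (the relaxed model), and the name `basic_allocation` is hypothetical. The list `occupied` plays the role of the $m$ bits of memory.

```python
import math
import random

def basic_allocation(n, m, k, delta, seed=0):
    """Simulate the basic stage-by-stage allocation; returns (success, loads).

    Assumptions: m divides n, and each round offers k independent uniform
    bins (repetitions allowed). Only the current block is tracked in memory.
    """
    assert n % m == 0
    rng = random.Random(seed)
    loads = [0] * n
    r = math.floor((1 - delta) * m)          # balls placed per stage
    for stage in range(n // m):
        start = stage * m
        occupied = [False] * m               # m bits: status of the bins of B_j
        for _ in range(r):
            choices = [rng.randrange(n) for _ in range(k)]
            target = next((c for c in choices
                           if start <= c < start + m and not occupied[c - start]),
                          None)
            if target is None:               # no empty bin of B_j was offered
                return False, loads
            occupied[target - start] = True
            loads[target] += 1
    return True, loads
```

Whether or not a run fails, no bin ever receives two balls, since a ball is only placed in a bin marked empty in the block currently tracked.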

To verify that this algorithm indeed produces a perfect allocation whp, examine a specific round of stage $j$, and condition on the event that so far the algorithm did not fail. In particular, its accounting of which bins are occupied in $B_j$ is accurate, and at least $m - r = (\delta - o(1))m$ bins in $B_j$ are still empty (notice that by our assumption $m = \Omega(\log n)$, and so $m \to \infty$ with $n$).

Let $\mathrm{Miss}_j$ denote the event that the next ball precludes all of the empty bins of $B_j$ in its $k$ choices. We have

\[ \mathbb{P}(\mathrm{Miss}_j) \le \Big(1 - \frac{(\delta - o(1))m}{n}\Big)^{k} \le e^{-(\delta - o(1))km/n} \le n^{-3+o(1)} , \tag{4.2} \]

by assumption (4.1). A union bound over the $n$ rounds now yields (with room to spare) that the algorithm succeeds whp.
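The chain of inequalities in the miss probability can be checked numerically. The sketch below ignores the $o(1)$ correction (an assumption made for simplicity) and uses a hypothetical helper `miss_probability_bound`; it compares $(1 - \delta m/n)^k$ with $e^{-\delta km/n}$, which is at most $n^{-3}$ once $km \ge (3/\delta)n\log n$.

```python
import math

def miss_probability_bound(n, m, k, delta):
    """Return ((1 - delta*m/n)^k, exp(-delta*k*m/n)).

    Assumption: exactly delta*m bins of the block are empty (the o(1)
    correction in the text is dropped in this illustration).
    """
    exact = (1 - delta * m / n) ** k
    exp_bound = math.exp(-delta * k * m / n)
    return exact, exp_bound
```

For instance, with $n = 10^4$, $m = 100$, $\delta = 1/2$ and the smallest $k$ satisfying (4.1), both quantities fall below $n^{-3}$, leaving a factor of $n^{2}$ to spare for the union bound over the rounds.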

The case where $m$ does not divide $n$ is treated similarly: Set $\ell = \lfloor n/\lfloor m/2\rfloor \rfloor$, and partition the bins into blocks that now hold $\lfloor m/2\rfloor$ bins each, except for the final block $B_\ell$ which would have between $\lfloor m/2\rfloor$ and $m - 1$ bins. As before, in stage $j$ we attempt to allocate $\lfloor (1-\delta)|B_j| \rfloor$ balls into $B_j$, while relying on the property that $B_j$ has at least $(\delta - o(1))|B_j| \ge (\delta - o(1))m/2$ empty bins. This gives

\[ \mathbb{P}(\mathrm{Miss}_j) \le e^{-(\delta - o(1))km/(2n)} \le n^{-3/2+o(1)} , \]

as required.

4.2. Intermediate version of the algorithm. We now wish to adapt the 
above algorithm to the following case: 

o n 

kmlog2m > (20/5) log (5/(5)n log n , log n < m < . (4.3) 

logn 

Notice that if $m \ge n^{\epsilon}$, the above requirement is essentially that

\[ km = \Omega(n/\epsilon) . \]

The full version of the algorithm will eliminate this dependency on $\epsilon$.



Intermediate version of allocation algorithm for $(1-\delta)n$ balls

1. For $\ell = \lfloor n/\lfloor m/2\rfloor \rfloor$, partition the bins into contiguous blocks $B_1,\dots,B_\ell$ each comprising $\lfloor m/2\rfloor$ bins. Ignore any remaining unused bins.

2. Set d = log2 rn\ , and define the arrays Aq, . . . , A^-i: 

• $A_j$ is one of the blocks $B_1,\dots,B_\ell$.

• For each contiguous (non-overlapping) $2^j$-tuple of bins in $A_j$, we keep a single bit that holds whether any of its bins is occupied.

3. Repeat the following procedure until exhausting all rounds:

• Let $j$ be the minimal integer so that a bin of $A_j$, marked as empty, appears in the current selection of $k$ bins. If no such $j$ exists, the algorithm announces failure.

• Allocate the ball into this bin, and mark its $2^j$-tuple as occupied.

• If the fraction of empty $2^j$-tuples remaining in $A_j$ just dropped below $\delta/2$, relocate the array $A_j$ to a fresh block (immediately beyond the last allocated block). If no such block is found, the algorithm fails.

4. Once $(1-\delta)n$ rounds are performed, the algorithm stops.
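The listing above can be turned into a small simulation. The following is a sketch under stated assumptions, not the authors' implementation: the number of arrays `d` is passed in as a parameter, the $k$ choices are independent uniform bins with repetition, trailing bins of a block that do not fill a whole tuple are ignored, and the name `intermediate_allocation` is hypothetical.

```python
import math
import random

def intermediate_allocation(n, m, k, delta, d, seed=0):
    """Simulate the intermediate algorithm; returns (success, loads).

    Assumptions: d arrays A_0..A_{d-1}, array A_j tracks one block at a
    resolution of 2^j-tuples, choices are uniform with repetition.
    """
    rng = random.Random(seed)
    block = m // 2                                   # bins per block
    num_blocks = n // block                          # leftover bins are ignored
    assert d <= num_blocks
    tuples = [max(1, block >> j) for j in range(d)]  # tuples tracked by A_j
    position = list(range(d))                        # block currently used by A_j
    marks = [[False] * tuples[j] for j in range(d)]  # one bit per tuple
    empty = tuples[:]                                # tuples still marked empty
    next_fresh = d
    loads = [0] * n
    for _ in range(math.floor((1 - delta) * n)):
        choices = [rng.randrange(n) for _ in range(k)]
        placed = False
        for j in range(d):                           # minimal j first
            for c in choices:
                idx = (c % block) >> j
                if c // block == position[j] and idx < tuples[j] and not marks[j][idx]:
                    loads[c] += 1
                    marks[j][idx] = True             # whole tuple is now off limits
                    empty[j] -= 1
                    placed = True
                    break
            if placed:
                if empty[j] < (delta / 2) * tuples[j]:   # relocate A_j
                    if next_fresh >= num_blocks:
                        return False, loads
                    position[j] = next_fresh
                    next_fresh += 1
                    marks[j] = [False] * tuples[j]
                    empty[j] = tuples[j]
                break
        if not placed:
            return False, loads
    return True, loads
```

Because every block is assigned to at most one array over the whole run, and a tuple is never reused once marked, no bin receives more than one ball in this sketch regardless of whether the run ends in failure.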



Since the array $A_j$ contains $2^{-j}(m/2)$ different $2^j$-tuples, the amount of memory required to maintain the status of all tuples is

d-l 

2 



- 2~^ = (1 - 2-'')m <m- m^'^ 



j=0 

In addition, we keep an index for each $A_j$, holding its position among the $\ell$ blocks. By definition of $d$ and $\ell$, this amounts to at most

dlogg^ < (loggn)^ < m^/^ 

bits of memory, where the last inequality holds for any large n by (j4.3p . 
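The memory accounting can be made concrete with a few lines of arithmetic. This sketch assumes one bit per $2^j$-tuple in a block of $\lfloor m/2\rfloor$ bins and one pointer of $\lceil \log_2 \ell \rceil$ bits per array; the value of `d` is a free parameter here and both helper names are hypothetical.

```python
import math

def tuple_memory_bits(m, d):
    """Bits needed for the tuple bitmaps of A_0..A_{d-1} (one bit per 2^j-tuple)."""
    half = m // 2
    return sum(half >> j for j in range(d))  # floor(half / 2^j) tuples in A_j

def pointer_bits(n, m, d):
    """Bits needed for d block pointers, each indexing one of n // (m // 2) blocks."""
    num_blocks = n // (m // 2)
    return d * math.ceil(math.log2(num_blocks))
```

For example, `tuple_memory_bits(1024, 5)` gives 992, which equals $(1 - 2^{-5}) \cdot 1024$, so the geometric sum leaves $m \cdot 2^{-d}$ bits free for the pointers.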

We first show that the algorithm does not fail to find a bin of $A_j$ marked as empty. At any given point, each $A_j$ has a fraction of at least $\delta/2$ bins marked as empty. Hence, recalling (4.2), the probability of missing all the bins marked as empty in $A_0,\dots,A_{d-1}$ is at most



exp 



u / N \ ""'^ , r / x\ lOlogn /20\ 1 ^ 

2 " ' ) 2^7 J < exp [ - ( - - o(l) ) YTZ:^ log ( — ) - logs m 



6 ,^,\ km 

2 J 6 log2 m V (5 / 4 

<^-log{5/5)5/4~o(l) ^^-5/4 ^ (4.4) 



where the last inequality holds for large n. Therefore, whp the algorithm 
never fails to find an array Aj with an empty bin among the k choices. 
It remains to show that, whenever the algorithm relocates an array $A_j$,
there is always a fresh block available. 

By the above analysis, the probability that a ball is allocated in Aj for 
j > 1 at a given round is at most 



exp 



0(1) 



km/2 



n 



< 



exp 



o(r 



10 log n 
S log2 m 

A 



log 



20 \ 



< exp (-31og(5/5)j) = pj 



where the last inequality holds for any sufficiently large n. 

Let Nj denote the number of balls that were allocated in blocks of type j 
throughout the run of the algorithm. Clearly, Nj is stochastically dominated 
by a binomial random variable Bin(n,pj). Hence, known estimates for the 
binomial distribution (see, e.g., pj) imply that for all j, 

F{Nj > npj + C^/nlogn) < . 

The total number of blocks needed for Aj is at most 

2^'iV. 



(1 



5\m 
2' 2 



and hence the total number of blocks needed is whp at most 

n I log 



^ 2^(1 - 5)npj + C2^^\ogn 

j=0 



(1 



5 \m 
2' 2 



3=0 



6)npj 



1 5 \in 

^ 2' 2 



+ 



n 



m 



Since 



d-l 



d-l 



^ 2^Pj = ^ exp (j(log 2 - 3 log(5/5))) < 2 • 2(5/5)^ < 5/5 

(with room to spare), the total number of blocks needed is whp at most 
{l + 6/5){l-6)n . ^/n^/Mofj 



(1 



2' 2 



o 



m 



< 



n 



[m/2\ 



for any sufficiently large n. 



4.3. Final version of the algorithm. The main disadvantage in the intermediate version of the algorithm is that the size of each $A_j$ was fixed at $m/2$ bins. Since the resolution of each $A_j$ is in $2^j$-tuples, we are limited to at most $\log_2 m$ arrays. However, the probability of missing all the arrays $A_0,\dots,A_{d-1}$ has to compete with $n$, hence the requirement that $m$ would be polynomial in $n$.

To remedy this, the algorithm uses arrays with increasing sizes, namely $2^j$ blocks for $A_j$. The resolution of each array is now in $4^j$-tuples, i.e., tracking the status of $A_j$ now requires at most $2^j\lfloor m/2\rfloor/4^j \le m/2^{j+1}$ bits. Recalling
that d = [log2 ((jj log n)] , the number of memory bits required for all arrays
is at most

\[ \sum_{j=0}^{d-1} \frac{m}{2^{j+1}} = (1 - 2^{-d})m \le m - \Omega(m/\log n) . \tag{4.5} \]

The following calculation shows that indeed there are sufficiently many 
blocks to initially accommodate all the arrays: 

(2'^-l)Lm/2j <-^mlogn<-^ = -n, 

where we used the assumptions k > (3/(5) logn and km = Cn. 

Each of the arrays comes along with a pointer to its starting block, and the total number of memory bits required for this is at most

\[ d\log_2(2n/m) \le (\log_2\log n + O(1))\log_2 n = (1 + o(1))\log_2 n \cdot \log_2\log n . \]

When m = r2(log^n), the space for these pointers clearly fits among the 
$\Omega(m/\log n)$ bits remaining according to (4.5). For smaller values of $m$, as before we can apply the algorithm for, say, $m' = m/3$ (after tripling the constant $C_\delta$ to reflect this change), thus earning $2m/3$ bits for the pointers (recall the requirement that $m \ge \log_2\log n \cdot \log_2 n$).

As final evidence that the choice of parameters for the algorithm is valid, 
note that each Aj indeed contains many 4-' -tuples. It suffices to check A^-i, 
which indeed comprises about 

(1 + = (1 + 0(1) W2' = f ? + oil)) ^ = J7(loglogn) 



2 ^ \ /J I \ ^ ^ V logn 

4'^~^-tuples, where the last equality is by the assumption on the order of m. 

It remains to verify that the algorithm succeeds whp. This will follow from the same argument as in the intermediate version of the algorithm. In that version, each $A_j$ contained at least a fraction of $(\delta/2)$ empty bins, and $|A_j|$ was about $m/2$ for all $j$. In the final version of the algorithm, each $A_j$ again contains at least a fraction of $(\delta/2)$ empty bins, but crucially, now $A_j$ contains about $2^{j-1}m$ bins. Thus, recalling (4.4), the probability to miss $A_0,\dots,A_{d-1}$ in a given round is now at most



exp 



5-°W)^E2lsexp(-(I-„(l))^(2^-l) 



j=0 



^-5/4-o(l) 



where the last inequality is by the definition of d. A union bound over the 
n rounds gives that, whp, an array Aj with an empty bin is found for every 
ball. 
To see that whp there are always sufficiently many available fresh blocks to relocate an array, one essentially repeats the argument from the intermediate version of the algorithm. That is, we again examine the probability that a ball is allocated in $A_j$, to obtain that this time

p, = exp(-(l-o(l))^(2^ -l)) . 
A choice of $C \ge K(1/\delta)\log(1/\delta)$ with some suitably large $K > 0$ would give

and the rest of that argument unchanged now implies that the algorithm 
never runs out of fresh blocks whp. 

This completes the proof of Theorem 4. ■

4.4. Proof of upper bound in Theorem 1. We now wish to apply the algorithm from Theorem 4 in order to obtain a constant load in the case where $km \ge cn$ for some $c > 0$. To achieve this, consider the perfect matching algorithm for, say, $\delta = \frac{1}{2}$, and let $C_\delta$ be the constant that appears in Theorem 4. Next, join every consecutive $\lceil C_\delta/c \rceil$-tuple of bins together and write $n'$ for the number of such tuples. As $km \ge C_\delta n'$, we may apply the perfect-matching algorithm for $n'/2$ balls with respect to the $n'$ tuples of bins, keeping in mind that the algorithm is valid also for the model of repetitions. This gives a perfect matching whp, and repeating this process gives a total load of at most $2C_\delta/c = O(1)$ for all $n$ balls. ■

5. Improved lower bounds for poly-logarithmic choices 

5.1. Proof of Theorem 2. Our proof of this case is an extension of the proof of Theorem 3. We now wish to estimate the number of $q$-collisions for general $q$:

\[ \mathrm{Col}_q(t) = \sum_{i=1}^{n} \binom{N_t(i)}{q} . \]
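The quantity $\mathrm{Col}_q(t)$ is straightforward to compute from a record of placements. The following minimal sketch assumes the placements are given as a list of bin indices; the name `q_collisions` is hypothetical.

```python
from collections import Counter
from math import comb

def q_collisions(assignments, q):
    """Number of q-subsets of balls sharing a bin: sum over bins of C(load, q)."""
    loads = Counter(assignments)          # bin index -> number of balls
    return sum(comb(load, q) for load in loads.values())
```

For the placements `[0, 0, 0, 1, 1, 2]` this gives 4 two-collisions and a single three-collision; a positive value of `q_collisions(..., q)` is equivalent to a maximal load of at least `q`.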

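The analysis hinges on a recursion on $q$, for which we need to achieve bounds on a generalized quantity, a linear function of the $q$-collisions vector:

\[ X_t^{f;q} = \sum_{i} f(i)\binom{N_t(i)}{q} = \sum_{s_1<\dots<s_q\le t}\ \sum_{i} f(i)\prod_{j=1}^{q}\mathbf{1}_{\{J_{s_j}=i\}} , \tag{5.1} \]

\[ V_t^{f;q} = \sum_{s_1<\dots<s_q\le t}\ \sum_{i} f(i)\prod_{j=1}^{q} Q_{s_j}(i) . \tag{5.2} \]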

Our objective is to obtain lower bounds for $X_t^{f;q}$ with $f = 1$, as clearly $\mathrm{Col}_q(t) = X_t^{1;q}$. Notice that the parameters $X_t^{\nu}, V_t^{\nu}$ from Section 3 are exactly $X_t^{\nu;1}, V_t^{\nu;1}$ defined above. There, $\nu$ was a strategy, whereas now our $f$ will be the product of different strategies. This fact will allow us to formulate a recursion relation between the $V_t^{f;q}$-s and an approximate recursion for the $X_t^{f;q}$. We achieve this using the next lemma, where here and throughout the proof we let

\[ L = \log(n/m) \tag{5.3} \]

denote a maximal load we do not expect to reach (except if the algorithm is far from optimal). We further define

\[ \mathcal{F}_L = \Big\{ \prod_{i=1}^{j} \nu_i \,:\, j \le L ,\ \nu_i \in \{\mu_1,\dots,\mu_{2^m}\} \text{ for all } i \Big\} \]

to be the set of all point-wise products of at most $L$ strategies from the pool.

Lemma 5.1. Either the maximal load exceeds L, or the following holds for 
all q < L, every t < n/k and every / G F, except with probability q-^"^^ , 

If y/;? > 100^^^^m||/||oo then xf''' > S'" V/''^ . (5.4) 

Proof. The key property of the quantities $V_t^{f;q}$, which justified the inclusion of the inner products with $f$, is the following recursion relation, whose validity readily follows from definition (5.2):

\[ V_t^{f;q+1} = \sum_{s<t} V_s^{(Q_{s+1}\cdot f);q} \quad\text{for any } q \ge 1 \text{ and any } t . \tag{5.5} \]

We now wish to write a similar recursion for the variables $X_t^{f;q}$. As opposed to the variables $V_t^{f;q}$, which satisfied the above recursion combinatorially, here the recursion will only be stochastic. Notice that

\[ X_{t+1}^{f;q+1} - X_t^{f;q+1} = f(J_{t+1})\left(\binom{N_t(J_{t+1})+1}{q+1} - \binom{N_t(J_{t+1})}{q+1}\right) = f(J_{t+1})\binom{N_t(J_{t+1})}{q} , \]

and hence

\[ \mathbb{E}\big[ X_{t+1}^{f;q+1} - X_t^{f;q+1} \,\big|\, \mathcal{F}_t \big] = X_t^{(Q_{t+1}\cdot f);q} . \]

We may therefore apply Proposition 2.1 as follows:

• The sequence of increments we consider is $(X_{s+1}^{f;q+1} - X_s^{f;q+1})$ (that results in a telescopic sum).

• The sequence of conditional expectations is $(X_s^{(Q_{s+1}\cdot f);q})$.

• The bound on the increment is $M = \|f\|_\infty \binom{L}{q}$, where $L$ is an upper bound for the maximal load (if we encounter a load of $L$, we stop the process).

This implies that



s<t s<t 



< exp ( + 2 ) = O (exp(-5mL)) . 



2OII/II00© 

As a result, the above event does not occur for any / S T (since there are at 
most 2"^^ such functions) except with probability e~^"^^. Therefore, setting 

^/;<7 = 100^^^m||/||oo , 
we have that, except with probability e"^"*^. 
If > S-^/i/jq then > . (5.6) 

s<t s<t 



We now proceed to prove (j5.4p by induction on q. For q = 1, notice that 



Furthermore, as the definition of X^'"^ also applies to the case g = 0, we 
obtain that 



^/^' = EE^^+i«/« = E^i''^"^- 



■/);0 

s<t i s<t 



Hence, combining the assumption V/'^ > 100(3L)^m||/||oo = hj-i with 
statement (j5.6p yields that x/'^ > ^V/'^ > ^V/'^, except with probabil- 
ity e"^™-^. 

It remains to establish the induction step. The induction hypothesis for q 
ates that whenever v/''^ > f 
probability e~^"^^. Therefore, 

j^Wii+r/);'? > 3-9 ^ YiQs+i-f);q . 

s<t ^ 

s<t 

>3-5/y^/;a+i_i.lOoi^^m||Q,+i-/|U) , (5.7) 



states that whenever v/''^ > hf-q we also have ^/''^ > 3 '^v/''^ except with 
where in the last inequality we applied the recursion relation (5.5). Recalling that $Q_{s+1}$ is a strategy, the following holds for all $t \le n/k$:

t\\Qs+l ■ /Hoc < tllQ.+lllooll/lloo < t-ll/lloo < 

n 

Plugging this into ()5.7p we obtain that for all t < n/k, 



, -m 

1' 



s<t 



^^(Q.H-i./);. > 3-^(y/^'?+i - 100^ 

3-« (V/-^^^' - hf.,,) . (5. 



Now, if v/''^'^^ > hf-q+i = 100 ^(fi^\i ?7i||/||oo, then in particular 



yf;q+i > 3 . 100^ — L m||/||oo = 3hf.„ for all g < L - 1. 

q\ 

Thus, under this assumption, (jS.Sp takes the following form: 

s<t 

Since this satisfies the condition of (|5.6p (where we actually only needed a 
lower bound of 3'''^hf-q), we obtain that except with probability e~^"^^, 

> i Y^;^(Q3+i-/);g > 3-(q+l)T//;9+l 
t — 2 / ^ ^ — t ' 

completing the induction step. 

Summing the error probabilities over the induction steps for every q < L 
concludes the proof of the lemma. ■ 

It remains to apply the above lemma to deduce the maximal load of 
^( log'logn ) ^ ~ polylog(ri). Recalling that m < n^~^ for some fixed 
5 > 0, let < e < (5/2 and choose the following parameters: 

/- N log(n/m) p ^ 

9=1- e) . ■ f / N , f = l , t = n k. 

log fe + log log(n/mj 

Lemma |5. II now gives that, either the maximal load exceeds L = log (n/m), 
or whp the following statements holds: 

If V^'f^ > lOO^^^^m then X^%, > 3"'? V^jl . (5.9) 

n/k — q\ n/k — n/k ^ ' 

Notice that for the above value of g, we have = (n/m)°(^) , and therefore, 
showing that the condition of (j5.9p is satisfied and that > (n/m)^/^ 
would immediately imply that the maximal load exceeds q whp. 

The following lemma, which provides a lower bound on V^''^ ■, is thus the 
final ingredient required for the proof of the theorem: 
Lemma 5.2. For all $t$, $k$ and $q$, all $Q_1,\dots,Q_t$ and any fixed $\alpha > 0$ we have

\[ V_t^{1;q} \ge \frac{(t - (1+\alpha)kq)^q}{e^{q/(2\alpha)}\, n^{q-1}\, q!} . \]



Proof. Recall that 



1;? 



Sl<...<Sq<t i = l 
1 " 



j = l Sl<f 



S2<t 



Sq<t 
Sq^{si,...,Sq-l} 



Defining 



s<t 



and recalling that HQs I loo ^ ^/'^ for all s, it follows that for all i and j > 1, 

^ >r-,-(j-l)/c/n . 

Sj<t 
Sj=/={si,...,Sj-l} 

Consequently, 

^t'^" ^ - E n - - 



9! : 



n q 



Next, notice that for aW 1 < j < q — 1, 



>(l+„)ikzl)} 



(5.10) 



1 



{l + a){q-l) 



> exp 



> exp 



J 



(l + a)(g-l) 
3 



(l + a)((7-l) 



a(g - 1). 

Thus, in case rj > (1 + a) ^^'^ we have the following for all 1 < j < g: 



r,; > rill 



j-1 



n \ (l + a)(g-l) 

Combining this with ()5.10p . we deduce that 



> r j exp 



i-1 

a(g - 1) 



1 



n (? 



J-1 



a{q - 1) 



l{n>(l+a)Mlzi)} 



1 " 



-l/(2a) 



^il{r,>{l+a)M^} 
Applying Cauchy-Schwarz, we infer that

-l/(2a) 

'''■^{n>{i+a)M^} 



gg/(2a)^g-lg| 

The proof of the lemma now follows from noticing that 

2Z ^^l{r,<(l+a)MiLzl)} < (1 + o^Mq - 1) < (1 + Oi)kq , 
i 

whereas n = J2s<t J2i Qs{i) = t. ■ 

To complete the proof using Lemma 5.2, apply this lemma for $\alpha = 1$, $t = n/k$, and $kq = n^{o(1)}$, giving that

where the last inequality holds for any sufficiently large n. Consequently, 

— — < 100(3L)'2+^(2A;)5— < 100(6A;L)«+^— . 

T/ n n 

n/k 

Since our choice of q is such that 
we have that 

N-£ + o(l) 



100(3L)9+im/g! ^ , ^ 
^ < (n/m) 



This implies both that V^j^, > [n/mY^'^ for any large n (recall that L > q), 
and that the condition of (5.9) is satisfied for any large $n$. Altogether, the maximal load is whp at least $q$, concluding the proof of Theorem 2. ■

5.2. A corollary for non-adaptive algorithms. We end this section with a corollary of Theorem 2 for the case of non-adaptive algorithms, i.e., the strategies $Q_1,\dots,Q_n$ are fixed ahead of time. Namely, we show that for $k = O\big(n\frac{\log\log n}{\log n}\big)$ the optimal maximal load is whp $\Theta\big(\frac{\log n}{\log\log n}\big)$, the same order as the one for $k = 1$. Theorem 5, whose proof appears in Section 6, includes a different approach that proves this result more directly.

Corollary 5.3. Consider the allocation problem ofn balls into n bins, where 
each ball has k independent uniform choices. If k < Cn then any 

non-adaptive algorithm whp has a maximal-load of at least ■ \og\^n ■ 

In particular, if k < n '"^'"^" then the load is at least (1 — o(l)) iog°ogn. whp. 
Proof. Let $Q_1,\dots,Q_n$ be the optimal sequence of strategies for the problem. Using definitions (5.1) and (5.2) with $f = 1$, we have the following for all $q$:

^i;, ^ ^ fN,{r)\ ^ ^^^^^^ 
j ^ ^ 

y/'" = ex/'" . 

Fix < e < ^. Applying Lemma 15.21 with t = n and a = e/[2(l — e)], 

.,i:,y^ in-{l + a)kq)'^ ^/ {2 - e)kq y n 

''n - e9/(2")n9-ig! V 2{l-e)n) e^-^kAg! ' ^ ^ 

Recalling that k < Cn ^°fJ°^" for some fixed C > 0, set 

1 — e log n 



C V 1 log log n 

This choice has kq/n < 1 — e and ql < fi^-^+"W _ Combined with (j5.11|) 

e)kq'^ ,/ {2 — e)kq\\ n 



(5.12) 



ECo,,,„,..-.».(-|^/(:-|^)) 



g(l-e)g/£(^f 



n 



>^-P[-~-Y''l)^^T^ = n'-"''^. (5.13) 

To translate the number of g-collisions to the number of bins with load q, 
consider the case where for some bin j we have Yl^=iQsij) ^ 100 log n. 
Proposition 12.11 (applied to the Bernoulli variables 1| j^=j}) then implies 
that Nn{j) > 50 log n except with probability 0{n~^), and in particular the 
maximal load exceeds q whp. We may therefore assume from this point on 
that X;r=i QsU) < 100 log n for all j. 

Set L = 150 log n. Clearly, upon increasing Qs{j) for some 1 < s < n, the 
load in bin j will stochastically dominate the original one. Thus, for any 
integer r > 1 we may increase 'Yl^=iQs{j) to and by Proposition 12.11 
obtain that Nn{j) < rL except with probability 0(exp(— rL/30)). Defining 

Ar = irL < max Nn{j) < (r + l)L} (for r = 0, 1, . . .) , 

I l<j<n 

we in particular get V{Ar) = 0(nexp(— rL/30)). However, clearly on this 
event Col<;(n) < n{^'"^^^^), and since n2(('^+^i)^) < O(exp(rL/50)), we have 

E [Colq(n) I Tq] P (Aq) < ^O(exp(-rL/100)) = 0(n~3/2^ = o(l) . 

r>l 

Thus, by (j5.13p . we have E[Colg(n) | ^o] ^ n^""*-^-*. Finally, since any given 
bin can contribute at most (^) = n°^^^ collisions to Colg(n) given Aq, 

n 

e[E l{^n0-)>5}] > E [Co\{n)/Q I A,] P(^o) = n^'°(^) . 
As demonstrated in the next section (see Lemma 6.2), one can now use the fact that the events $\{N_n(j) \ge q\}$ are negatively correlated to establish concentration for the variable $\sum_{j=1}^{n} \mathbf{1}_{\{N_n(j) \ge q\}}$. Altogether, we deduce that the maximal load whp exceeds $q$, as required. ■

6. Tight bounds for non-adaptive allocations 

In this section we present the proof of Theorem 5. Throughout the proof we assume, whenever this is needed, that $n$ is sufficiently large. To simplify the presentation we omit all floor and ceiling signs whenever these are not crucial. We need the following lemma.

Lemma 6.1. Let pi,P2, ■ ■ ■ iPn be reals satisfying < pi < " for all i, 
such that Y17=iPi ^ 1 ~ ^; where e = e(n) G [0, 1]. Let Xi,X2, ■ ■ ■ ,Xn be 
independent indicator random variables, where ^{Xi = 1) = pi for all i, and 
put X = Y27=i -^i- ^^6n 

log n \ 1 



.(.T>(l-s)^)> 

V loeloffn/ 



l-e ■ 



log log n/ n 

Proof. Without loss of generality assume that pi > p2 > • • • > Pn- Define 
a family of k pairwise disjoint blocks i?2, ■ ■ ■ ,Bk C {1,2,..., n}, where 
k>{l- e)T^¥^ so that for each i, 1 < i < A;, 



2 log log n 



log n ^-^ log n 

This can be easily done greedily; the first block consists of the indices 
1,2, ...,r where r is the smallest integer so that 'Y^j^iPj > kjfn- Note 
that it is possible that r = 1, and that since the sequence pj is monotone 
decreasing, 'Yl^j=iPj — ^"log'n'^ ' Assuming we have already partitioned the 
indices {1, . . . , r} into blocks, and assuming we still do not have (1— e) ^^^^^^^ 
blocks, let the next block be {r + 1, . . . , s} with s being the smallest inte- 
ger exceeding r so that Yfj=r+iPj ^ ^o^^ ^^^^ if Pr+i > then 
s = r + 1, that is, the block consists of a single element, and otherwise 
Yli^j=rJriPj < < °iog°ra" ' Thus, in any case the sum above is at least 
and at most ^ . Since the total sum of the reals p,- is at least 1 — e 

log n log n 

this process does not terminate before generating /c > (1 — e) blocks, 
as needed. 
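The greedy construction of blocks described in this proof can be sketched as follows. The threshold on each block sum is left as a parameter, since its exact value should be taken from the lemma; the name `greedy_blocks` and the optional `max_blocks` cap are hypothetical.

```python
def greedy_blocks(p, threshold, max_blocks=None):
    """Greedily split indices, sorted by decreasing p, into consecutive blocks.

    Each block is the shortest run of indices whose p-values sum to at least
    `threshold`. Assumption: `threshold` stands in for the lemma's lower bound
    on a block sum. Leftover indices that do not reach the threshold are dropped.
    """
    order = sorted(range(len(p)), key=lambda i: -p[i])
    blocks, current, total = [], [], 0
    for i in order:
        current.append(i)
        total += p[i]
        if total >= threshold:            # block is complete
            blocks.append(current)
            current, total = [], 0
            if max_blocks is not None and len(blocks) == max_blocks:
                break
    return blocks
```

Since a block is closed as soon as its sum reaches the threshold, each block sum is below the threshold plus the largest single entry; when every $p_i$ is at most the threshold, this gives block sums between the threshold and twice the threshold.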

Fix a family of A; = (1 — e) ^^'^^^^ blocks as above. Note that for each 
fixed block Bi in the family, the probability that Ylj^Bi -^i ^ 1 is at least 
It thus follows that the probability that for each of the k blocks Bi in the
family X^jgBi -^j > 1 is at least (i^g^^)^ = completing the proof of the 
lemma. ■ 

Proof of Theorem 5, Part (i). As before, our framework is the relaxed model where there are strategies $Q_1, Q_2, \dots, Q_n$, where $Q_t$ is the distribution of the bin to be selected for ball number $t$, satisfying $\|Q_t\|_\infty \le k/n$. However, since now we consider non-adaptive algorithms, the strategies are no longer random variables, but rather a predetermined sequence. We therefore let $P = (p_{it})$ denote the $n \times n$ matrix of probabilities, where $p_{it}$ is the probability that the ball at time $t$ would be placed in bin $i$. Clearly

< Pit 1^ k/n = — — - — for all i and t , 
log n 

and 

Pit = 1 for all t . 

l<i<n 

The sum of entries of each column of the $n \times n$ matrix $(p_{it})$ is 1, and hence
the total sum of its entries is $n$. If it contains a row $i$ so that the sum of
entries in this row is at least, say, $\log n$, then the expected number of balls in
bin number $i$ by the end of the process is $\sum_{t=1}^{n} p_{it} \ge \log n$, and as the variance
is

\[ \sum_{t=1}^{n} p_{it}(1 - p_{it}) \le \sum_{t=1}^{n} p_{it} , \]

it follows by Chebyshev's Inequality (or by Hoeffding's Inequality) that with
high probability the actual number of balls placed in bin number $i$ exceeds
$\frac{\log n}{2} > \frac{\log n}{\log\log n}$, showing that in this case the desired result holds.

We thus assume that the sum of entries in each row is at most $\log n$. As
the average sum in a row is 1, there is a row whose total sum is at least 1.
Omit this row, and note that since its total sum is at most $\log n$, the sum
of all remaining entries of the matrix is still at least $n - \log n$, and hence
the average sum of a row in it is at least $\frac{n - \log n}{n-1} > 1 - \frac{\log n}{n}$. Therefore there
is another row of total sum at least this quantity. Omitting this row and
proceeding in this manner we can define a set of rows so that the sum in each
of them is large. Note that as long as we defined at most $\frac{n}{\log^2 n}$ rows, the
total sum of the remaining elements of the matrix is still at least $n - \frac{n}{\log n}$,
and hence there is another row of total sum at least $1 - \frac{1}{\log n}$. We have thus
shown that there is a set $I$ of $\frac{n}{\log^2 n}$ rows such that

\[ \sum_{t=1}^{n} p_{it} \ge 1 - \frac{1}{\log n} \quad \text{for each } i \in I . \]
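
The row-selection step can be illustrated numerically. The sketch below is an assumption-laden example, not the paper's procedure verbatim: it takes logarithms to base 2, builds a hypothetical random column-stochastic matrix as input, and realizes the greedy choice by simply taking the rows of largest sum (which is what repeatedly picking the heaviest remaining row amounts to). The helper names are invented for illustration.

```python
import math
import random

def select_heavy_rows(P, count):
    """Return (row index, row sum) for the `count` rows of largest sum."""
    sums = sorted(((sum(row), i) for i, row in enumerate(P)), reverse=True)
    return [(i, s) for s, i in sums[:count]]

def random_column_stochastic(n, rng):
    """Hypothetical test input: an n x n matrix whose columns each sum to 1.
    Entry P[i][t] plays the role of p_it (ball t goes to bin i)."""
    cols = []
    for _ in range(n):
        w = [rng.random() for _ in range(n)]
        total = sum(w)
        cols.append([v / total for v in w])
    return [[cols[t][i] for t in range(n)] for i in range(n)]

n = 256
L = math.log2(n)                                   # log n = 8 (base 2 assumed)
P = random_column_stochastic(n, random.Random(0))
rows = select_heavy_rows(P, int(n / L**2))         # n / log^2 n = 4 rows
```

Provided no row sum exceeds $\log n$, the counting argument above guarantees that each selected row has sum at least $1 - \frac{1}{\log n}$.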
For each $i \in I$, let $A_i$ denote the event

\[ A_i = \Big\{ \text{there are at most } \tfrac{\log n}{\log\log n} - 4 \text{ balls in bin } i \Big\} . \]

Applying Lemma 6.1 with $\varepsilon = \frac{3\log\log n}{\log n}$ we get

\[ \mathbb{P}(A_i) \le 1 - \frac{\log^3 n}{n} \quad \text{for each } i \in I . \]

We will next show that, as the events $A_i$ are negatively correlated, the
probability that all of these events occur is at most the product of these
probabilities (which is negligible).

Lemma 6.2. Define the events $\{A_i : i \in I\}$ as above. Then
$\mathbb{P}\big(\bigcap_{i \in S} A_i\big) \le \prod_{i \in S} \mathbb{P}(A_i)$ for any $S \subset I$.

Proof. The proof proceeds by induction on $|S|$. For the empty set this is
trivial, and we will prove that for every set $S$ and $j \in I \setminus S$

\[ \mathbb{P}\Big(\bigcap_{i \in S} A_i \cap A_j\Big) \le \mathbb{P}\Big(\bigcap_{i \in S} A_i\Big)\,\mathbb{P}(A_j) . \]

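Define the following independent random variables for every time $t$:

\[ \mathbb{P}(B_t = 1) = p_{jt} , \qquad \mathbb{P}(B_t = 0) = 1 - p_{jt} , \]

\[ \mathbb{P}(H_t = i) = \frac{p_{it}}{1 - p_{jt}} \quad \text{for each } i \ne j . \]

We may now define $J_t$, the position of the ball at time $t$, as a function of $B_t$
and $H_t$ such that indeed $\mathbb{P}(J_t = i) = p_{it}$ for all $i$:

\[ J_t = \begin{cases} j & B_t = 1 , \\ H_t & B_t = 0 . \end{cases} \]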

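Crucially, the event $A_j$ depends only on the values of $\{B_t\}$, and is monotone
decreasing in them. Further notice that the function

\[ f(b_1, \ldots, b_n) = \mathbb{P}\Big(\bigcap_{i \in S} A_i \,\Big|\, B_1 = b_1, \ldots, B_n = b_n\Big) \]

is monotone increasing in the $b_i$-s. Therefore, applying the FKG-inequality
(see, e.g., [U Chapter 6] and also [161 Chapter 2]) on $Y = f(B_1, \ldots, B_n)$
and $\mathbf{1}_{A_j}$ gives

\[ \mathbb{P}\Big(\bigcap_{i \in S} A_i \cap A_j\Big) = \mathbb{E}\big[Y \mathbf{1}_{A_j}\big] \le \mathbb{E}[Y]\,\mathbb{P}(A_j) = \mathbb{P}\Big(\bigcap_{i \in S} A_i\Big)\,\mathbb{P}(A_j) , \]
as required. ■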
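
The two-stage description of the ball's position used in this proof (first decide whether the ball goes to bin $j$, then pick among the other bins) can be checked exactly on a small example. The sketch below is illustrative only; the function name `coupled_law` and the three-bin distribution in the usage are assumptions, and exact rational arithmetic is used so that the comparison is not affected by rounding.

```python
from fractions import Fraction

def coupled_law(p, j):
    """Law of J built from B ~ Bernoulli(p[j]) and an independent H with
    P(H = i) = p[i] / (1 - p[j]) for i != j, where J = j if B = 1 and
    J = H otherwise. Here p is the column (p_it)_i for one fixed round t."""
    law = {j: p[j]}                                   # B = 1 sends the ball to bin j
    for i, pi in enumerate(p):
        if i != j:
            law[i] = (1 - p[j]) * (pi / (1 - p[j]))   # B = 0, then H = i
    return law

# Usage with a hypothetical three-bin column:
p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
law = coupled_law(p, j=1)
```

The resulting law coincides with the original column, which is the property the proof needs in order to express $A_j$ through the variables $B_t$ alone.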

Altogether, we obtain that the probability that all of the bins with indices
in $I$ have at most $\frac{\log n}{\log\log n} - 4$ balls is

\[ \mathbb{P}\Big(\bigcap_{i \in I} A_i\Big) \le \Big(1 - \frac{\log^3 n}{n}\Big)^{n/\log^2 n} \le e^{-\log n} . \]

This completes the proof of Part (i). ■

Proof of Theorem 5, Part (ii). It is convenient to describe the proof of
this part for a slightly different model instead of the one considered in the
previous sections. Namely, in the variant model, in every round each bin
among the $n$ bins is chosen randomly and independently as one of the options
with probability $\alpha$. By Chernoff's bounds, our results in this model will
carry into the original one, since obtaining $\alpha n$ uniform bins is dominated by
getting each bin independently with probability $(1+\varepsilon)\alpha$, and dominates a
probability of $(1-\varepsilon)\alpha$ for each bin.

For the simplicity of the notations, we will henceforth consider the case
$k = n/2$, noting that our proofs hold for $k = \alpha n$ with any $0 < \alpha < 1$ fixed.

As noted in the Introduction, the relaxed model of strategies $Q_t$ such
that $\|Q_t\|_\infty \le k/n$ is stronger than the model where there are $k$ uniform
options for bins. In fact, the results of this part (an optimal maximal load of
order $\sqrt{\log n}$) do not hold for the relaxed model. For instance, if $Q_t$ assigns
probability $k/n = \frac{1}{2}$ to $i = t$ and $i = t + 1$ (with the indices reduced
modulo $n$), the maximum load will be at most 2.
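
A quick simulation of this relaxed-model strategy shows why the load can never exceed 2: bin $i$ can only receive ball $i$ or ball $i-1$. The sketch uses 0-based indices and a hypothetical function name; it is an illustration of the example above, not code from the paper.

```python
import random
from collections import Counter

def two_neighbour_max_load(n, rng):
    """Ball t goes to bin t or bin t+1 (mod n), each with probability 1/2.
    Returns the maximum load over all bins."""
    loads = Counter()
    for t in range(n):
        loads[(t + rng.randrange(2)) % n] += 1
    return max(loads.values())

# Usage: the bound holds deterministically, whatever the random choices are.
worst = max(two_neighbour_max_load(1000, random.Random(seed)) for seed in range(20))
```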

However, it is easy to see that in fact each strategy $Q_t$ is more restricted.
Indeed, the total probability that $Q_t$ can assign to any $r$ bins does not exceed
$1 - 2^{-r}$, as for each fixed set $I$ of $r$ bins, the probability that none of the
members of $I$ is an optional choice for ball number $t$ is $2^{-r}$.
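
For a small number of bins the $1 - 2^{-r}$ cap can be verified by brute force. The sketch assumes the variant model with $\alpha = 1/2$, so that all $2^n$ option patterns are equally likely; the function name and the choice of $n = 6$ are illustrative assumptions.

```python
from itertools import product
from fractions import Fraction

def prob_some_offered(n, target):
    """Probability that at least one bin in `target` is offered in a round,
    when each of the n bins is offered independently with probability 1/2."""
    hits = sum(1 for opts in product((0, 1), repeat=n)
               if any(opts[i] for i in target))
    return Fraction(hits, 2 ** n)

# Usage: a set of r = 3 bins out of n = 6.
q = prob_some_offered(6, {0, 2, 5})
```

Since a ball can only be placed in an offered bin, this probability bounds the total mass any strategy can put on the chosen set of $r$ bins.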

We start with the simple proof of the upper bound, obtained by the 
natural algorithm which places the ball in round t in the first possible bin 
(among the k given choices) that follows bin number t in the cyclic order of 
the bins. 

Lemma 6.3. There exists a non-adaptive strategy ensuring that, whp, the
maximum load in the above model is at most $O(\sqrt{\log n})$.

Proof. Order the bins cyclically $b_1, b_2, \ldots, b_n, b_{n+1} = b_1$. For each round $t$,
$1 \le t \le n$, place the ball number $t$ in the first possible bin $b_i$ that follows
$b_t$ in our cyclic order and is one of the given options for this round. Note,
first, that the probability that the ball in round $t$ is placed in a bin whose
distance from $b_t$ exceeds $2\log n$ is precisely the probability that none of the
$2\log n$ bins following $b_t$ is chosen in round $t$, which is

\[ 2^{-2\log n} < n^{-5/4} . \]

Therefore, with high probability, this does not happen for any $t$. In addition,
the probability that a fixed bin $b_i$ gets a load of $\sqrt{\log n}$ from balls placed in
the $2\log n - 2\sqrt{\log n}$ rounds $\{i - 2\log n + 1, i - 2\log n + 2, \ldots, i - 2\sqrt{\log n}\}$
does not exceed

\[ \binom{2\log n - 2\sqrt{\log n}}{\sqrt{\log n}} \Big(\frac{1}{2^{2\sqrt{\log n}}}\Big)^{\sqrt{\log n}} \le \frac{1}{n^{2-o(1)}} . \]

Indeed, for each fixed value of $t \in [i - 2\log n + 1, i - 2\sqrt{\log n}]$, if the ball
placed in round number $t$ ends in bin number $i$, then none of the $2\sqrt{\log n}$
bins preceding $b_i$ is chosen as an optional bin for ball number $t$, and the
probability of this event is $2^{-2\sqrt{\log n}}$. There are $\binom{2\log n - 2\sqrt{\log n}}{\sqrt{\log n}}$ possibilities
to select $\sqrt{\log n}$ rounds in the set $\{i - 2\log n + 1, i - 2\log n + 2, \ldots, i - 2\sqrt{\log n}\}$,
and as the choices of options for each round are independent, the desired
estimate follows.

We conclude that with high probability no bin $b_i$ gets any balls from round
$t$ with $t \le i - 2\log n$, and no bin $b_i$ gets more than $\sqrt{\log n}$ balls from rounds $t$
with $i - 2\log n < t \le i - 2\sqrt{\log n}$. As $b_i$ can get at most $2\sqrt{\log n}$ balls from all
other rounds $t$ (as there are only $2\sqrt{\log n}$ such rounds), it follows that with
high probability the maximum load does not exceed $3\sqrt{\log n}$, completing
the proof of the lemma. Note that it is easy to improve the constant factor
3 in the estimate proved here, but we make no attempt to optimize it. ■
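
A small simulation can make the cyclic strategy concrete. The sketch below assumes the variant model with $k = n/2$ (each bin is offered independently with probability 1/2), binary logarithms, and 0-based bin indices. The function name `cyclic_first_fit` and the decision to drop a ball in a round where no bin at all is offered (an event of probability $2^{-n}$) are illustrative assumptions. Offers are sampled lazily while scanning, which yields the same law because offers to distinct bins are independent.

```python
import math
import random

def cyclic_first_fit(n, rng, alpha=0.5):
    """Place ball t in the first offered bin among b_{t+1}, b_{t+2}, ...
    (cyclically), where each bin is offered independently w.p. alpha."""
    loads = [0] * n
    for t in range(n):
        for d in range(1, n + 1):
            if rng.random() < alpha:        # bin (t + d) mod n is an option
                loads[(t + d) % n] += 1
                break
        # if no bin was offered in this round, the ball is dropped (assumption)
    return loads

# Usage: compare the observed maximum load with 3 * sqrt(log n).
n = 4096
loads = cyclic_first_fit(n, random.Random(1))
bound = 3 * math.sqrt(math.log2(n))
```

The strategy is non-adaptive in the sense used here: the rule for round $t$ depends only on $t$ and on the options offered in that round, never on the current loads.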

We proceed with the proof of the lower bound. As in the proof of Part (i)
of the theorem, let $P = (p_{it})$ be the $n \times n$ matrix of probabilities corresponding
to our non-adaptive strategy, where $p_{it}$ is the probability that the ball
in round $t$ will be placed in bin number $i$. Recall that for each fixed round
$t$, the sum of the largest $r$ numbers $p_{it}$ cannot exceed $1 - 2^{-r}$. This fact will
be the only property of the distribution $p_{it}$ used in the proof.

Call an entry $p_{it}$ of the matrix $P$ large if $p_{it} \ge 2^{-\sqrt{\log n}}$; otherwise, $p_{it}$
is small. Call a column $t$ of $P$ concentrated if it has at least $0.5\sqrt{\log n}$ large
elements. We consider two possible cases.

• Case 1: There are at least n/2 concentrated columns. 

In this case, there are at least $\frac{n\sqrt{\log n}}{4}$ large entries in $P$. If there is a row,
say row number $i$ of $P$, containing at least, say, $2^{2\sqrt{\log n}}$ large entries, then
the expected number of balls in the corresponding bin is $\sum_{t=1}^{n} p_{it} \ge 2^{\sqrt{\log n}}$,
and, as the variance of this quantity is smaller than the expectation, it
follows that in this case with high probability this bin will have a load that
exceeds $\Omega(2^{\sqrt{\log n}}) > \sqrt{\log n}$. We thus assume that no row contains more
than $2^{2\sqrt{\log n}}$ large elements. Therefore, there are at least $\frac{n}{2^{2\sqrt{\log n}}} = n^{1-o(1)}$
rows, each containing at least, say, $\frac{\sqrt{\log n}}{8}$ large elements. Indeed, we can
select such rows one by one. As long as the number of selected rows does
not exceed $\frac{n}{2^{2\sqrt{\log n}}}$, the total number of large elements in them is at most $n$,
and hence the remaining rows still contain at least $\frac{n\sqrt{\log n}}{4} - n \ge \frac{n\sqrt{\log n}}{8}$
large elements, implying that there is still another row containing at least
$\frac{\sqrt{\log n}}{8}$ large elements.

Fix a bin corresponding to a row with at least $\frac{\sqrt{\log n}}{8}$ large elements, and
fix $\frac{\sqrt{\log n}}{8}$ of them. The probability that all balls corresponding to these large
elements will be placed in this bin is at least

\[ \Big(\frac{1}{2^{\sqrt{\log n}}}\Big)^{\sqrt{\log n}/8} = \frac{1}{n^{1/8}} . \]

As there are at least $n^{1-o(1)}$ such bins, and the events of no large load in
distinct bins are negatively correlated (see Lemma 6.2), we conclude that
the probability that none of these bins has a load at least $\frac{\sqrt{\log n}}{8}$ is at most

\[ \Big(1 - \frac{1}{n^{1/8}}\Big)^{n^{1-o(1)}} = o(1) , \]

showing that in this case the maximum load is indeed $\Omega(\sqrt{\log n})$ with high
probability.

• Case 2: There are less than n/2 concentrated columns. 

In this case, the sum of all small entries of the matrix $P$ is at least

\[ \frac{n}{2 \cdot 2^{0.5\sqrt{\log n}}} , \]

since each of the $n/2$ non-concentrated columns has less than $0.5\sqrt{\log n}$ large
elements, and hence in each such column the sum of all small elements is at
least $2^{-0.5\sqrt{\log n}}$.

Call a small entry $p = p_{it}$ of $P$ an entry of type $r$ (where $\sqrt{\log n} \le r <
2\log n$), if $\frac{1}{2^{r+1}} \le p < \frac{1}{2^r}$. Since the sum of all entries of $P$ that are smaller
than $2^{-2\log n} = 1/n^2$ is at most 1, there is a value of $r$ in the above range,
so that the sum of all entries of $P$ of type $r$ is at least

\[ \frac{n}{4\log n \cdot 2^{0.5\sqrt{\log n}}} > \frac{n}{2^{0.75\sqrt{\log n}}} . \]

Put

\[ x \triangleq 2^{0.75\sqrt{\log n}} , \]

and note that there are at least $\frac{2^r n}{x}$ entries of type $r$ in $P$ (since otherwise
their total sum cannot be at least $n/x$). We now restrict our attention to
these entries.

We can assume that there is no row containing more than $2^{r+1}\log n$ of
these entries. Indeed, otherwise the expected number of balls in the corresponding
bin is at least $\log n$, the variance is smaller, and hence by Chebyshev
with high probability the load in this bin will exceed $\Omega(\log n) > \sqrt{\log n}$.
We can now apply again our greedy procedure and conclude that there are
at least $\frac{n}{4x\log n} = n^{1-o(1)}$ rows, each containing at least $\frac{2^r}{2x}$ entries of type $r$;
indeed, a set of less than $\frac{n}{4x\log n}$ rows contains a total of at most

\[ \frac{n}{4x\log n} \cdot 2^{r+1}\log n = \frac{2^r n}{2x} \]

elements of type $r$, leaving at least $\frac{2^r n}{2x}$ such elements in the remaining rows,
and hence ensuring the existence of an additional row with at least $\frac{2^r}{2x}$ such
entries.

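Fix a bin corresponding to a row with at least $\frac{2^r}{2x}$ entries of type $r$. The
probability that exactly $t$ balls corresponding to these entries will be placed
in this bin is at least

\[ \binom{2^r/2x}{t} \Big(\frac{1}{2^{r+1}}\Big)^{t} \Big(1 - \frac{1}{2^r}\Big)^{2^r/2x} \ge \Big(\frac{2^r}{2xt}\Big)^{t} \Big(\frac{1}{2^{r+1}}\Big)^{t} \Big(1 - \frac{1}{2x}\Big) \ge \frac{1}{2}\,(4xt)^{-t} ; \]

for $t = \sqrt{\log n}$ the last quantity is at least

\[ \frac{1}{2}\Big(4 \cdot 2^{0.75\sqrt{\log n}} \sqrt{\log n}\Big)^{-\sqrt{\log n}} = n^{-3/4 - o(1)} . \]

This, the fact that there are $n^{1-o(1)}$ such rows, and the negative correlation
imply that in this case, too, with high probability there is a bin with load
at least $\Omega(\sqrt{\log n})$. This completes the proof of Part (ii) of the theorem. ■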

7. Concluding remarks and open problems 

• We have established a sharp choice-memory tradeoff for achieving a 
constant maximal load in the balls-and-bins experiment, where there 
are n balls and n bins, each ball has k uniformly chosen options for 
bins, and there are m bits of memory available. Namely: 

1. If $km = \Omega(n)$ for $k = \Omega(\log n)$ and $m = \Omega(\log n \log\log n)$, then
there exists an algorithm that achieves an $O(1)$ maximal load whp.

2. If $km = o(n)$ for $m = \Omega(\log n)$, then any algorithm whp creates
an unbounded maximal load. For this case we provide two lower
bounds on the load: $\Omega\big(\log\log\big(\frac{n}{km}\big)\big)$ and $(1+o(1))\frac{\log(n/m)}{\log k + \log\log(n/m)}$.

• In particular, if $m = n^{1-\delta}$ for some $\delta > 0$ fixed and $2 \le k \le \mathrm{polylog}(n)$,
we obtain a lower bound of $\Theta\big(\frac{\log n}{\log\log n}\big)$ on the maximal load. That is,
the typical maximal load in any algorithm has the same order as the
typical maximal load in a random allocation of $n$ balls in $n$ bins.

• Given our methods, it seems plausible and interesting to improve the
above lower bounds to $(1 + o(1))\frac{\log(n/(km))}{\log\log(n/(km))}$, analogous to the load of
$(1 + o(1))\frac{\log n}{\log\log n}$ in a completely random allocation.

• Note that, when $km = n^{1-\delta}$ for some fixed $\delta > 0$, even the above
conjectured lower bound is still a factor of $\delta$ away from the upper bound
given by a random allocation. It would be interesting to close the gap
between these two bounds. Concretely, suppose that $km = \sqrt{n}$; can
one outperform the typical maximal load in a random allocation?

• To prove our main results, we study the problem of achieving a perfect
allocation (one that avoids collisions, i.e., a matching) of $(1 - \delta)n$ balls
into $n$ bins. We show that there exist constants $C > c > 0$ such that:

1. If $km \ge Cn$ for $k = \Omega(\log n)$ and $m = \Omega(\log n \cdot \log\log n)$, then
there exists an algorithm that achieves a perfect allocation whp.

2. If $km \le cn$ for $m = \Omega(\log n)$, then any algorithm creates $\Omega(n)$
collisions whp.

• In light of the above, it would be interesting to show that there exists
a critical $c > 0$ such that, say for $k, m \ge \log^2 n$, the following holds:
If $km \ge (c + o(1))n$ then there is an algorithm that achieves a perfect
allocation whp, whereas if $km \le (c - o(1))n$ then any algorithm has
$\Omega(n)$ collisions whp.

• The key to proving the above results is a combination of martingale 
analysis and a Bernstein-Kolmogorov type large deviation inequality. 
The latter, Proposition 2.1, relates a sum of a sequence of random variables
to the sum of its conditional expectations, and crucially does not
involve the length of the sequence. We believe that this inequality may 
have other applications in Combinatorics and the analysis of algorithms. 

• We also analyzed the case of non-adaptive algorithms, where we showed
that for every $k = O\big(n\frac{\log\log n}{\log n}\big)$, the best possible maximal load whp is
$\Theta\big(\frac{\log n}{\log\log n}\big)$, the same as in a random allocation. For $k = \alpha n$ with
$0 < \alpha < 1$, we proved that the best possible maximal load is $\Theta(\sqrt{\log n})$.
Hence, one can ask what the minimal order of $k$ is, where an algorithm
can outperform the order of the maximal load in the random allocation.